The patch titled
Subject: mm/slub: fix panic in slab_alloc_node()
has been added to the -mm tree. Its filename is
mm-slub-fix-panic-in-slab_alloc_node.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-slub-fix-panic-in-slab_alloc_n…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-slub-fix-panic-in-slab_alloc_n…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Laurent Dufour <ldufour(a)linux.ibm.com>
Subject: mm/slub: fix panic in slab_alloc_node()
While doing a memory hot-unplug operation on a PowerPC VM running 1024 CPUs
with 11TB of RAM, I hit the following panic:
BUG: Kernel NULL pointer dereference on read at 0x00000007
Faulting instruction address: 0xc000000000456048
Oops: Kernel access of bad area, sig: 11 [#2]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in: rpadlpar_io rpaphp
CPU: 160 PID: 1 Comm: systemd Tainted: G D 5.9.0 #1
NIP: c000000000456048 LR: c000000000455fd4 CTR: c00000000047b350
REGS: c00006028d1b77a0 TRAP: 0300 Tainted: G D (5.9.0)
MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24004228 XER: 00000000
CFAR: c00000000000f1b0 DAR: 0000000000000007 DSISR: 40000000 IRQMASK: 0
GPR00: c000000000455fd4 c00006028d1b7a30 c000000001bec800 0000000000000000
GPR04: 0000000000000dc0 0000000000000000 00000000000374ef c00007c53df99320
GPR08: 000007c53c980000 0000000000000000 000007c53c980000 0000000000000000
GPR12: 0000000000004400 c00000001e8e4400 0000000000000000 0000000000000f6a
GPR16: 0000000000000000 c000000001c25930 c000000001d62528 00000000000000c1
GPR20: c000000001d62538 c00006be469e9000 0000000fffffffe0 c0000000003c0ff8
GPR24: 0000000000000018 0000000000000000 0000000000000dc0 0000000000000000
GPR28: c00007c513755700 c000000001c236a4 c00007bc4001f800 0000000000000001
NIP [c000000000456048] __kmalloc_node+0x108/0x790
LR [c000000000455fd4] __kmalloc_node+0x94/0x790
Call Trace:
[c00006028d1b7a30] [c00007c51af92000] 0xc00007c51af92000 (unreliable)
[c00006028d1b7aa0] [c0000000003c0ff8] kvmalloc_node+0x58/0x110
[c00006028d1b7ae0] [c00000000047b45c] mem_cgroup_css_online+0x10c/0x270
[c00006028d1b7b30] [c000000000241fd8] online_css+0x48/0xd0
[c00006028d1b7b60] [c00000000024af14] cgroup_apply_control_enable+0x2c4/0x470
[c00006028d1b7c40] [c00000000024e838] cgroup_mkdir+0x408/0x5f0
[c00006028d1b7cb0] [c0000000005a4ef0] kernfs_iop_mkdir+0x90/0x100
[c00006028d1b7cf0] [c0000000004b8168] vfs_mkdir+0x138/0x250
[c00006028d1b7d40] [c0000000004baf04] do_mkdirat+0x154/0x1c0
[c00006028d1b7dc0] [c000000000032b38] system_call_exception+0xf8/0x200
[c00006028d1b7e20] [c00000000000c740] system_call_common+0xf0/0x27c
Instruction dump:
e93e0000 e90d0030 39290008 7cc9402a e94d0030 e93e0000 7ce95214 7f89502a
2fbc0000 419e0018 41920230 e9270010 <89290007> 7f994800 419e0220 7ee6bb78
This points to the following code:
mm/slub.c:2851
if (unlikely(!object || !node_match(page, node))) {
c000000000456038: 00 00 bc 2f cmpdi cr7,r28,0
c00000000045603c: 18 00 9e 41 beq cr7,c000000000456054 <__kmalloc_node+0x114>
node_match():
mm/slub.c:2491
if (node != NUMA_NO_NODE && page_to_nid(page) != node)
c000000000456040: 30 02 92 41 beq cr4,c000000000456270 <__kmalloc_node+0x330>
page_to_nid():
include/linux/mm.h:1294
c000000000456044: 10 00 27 e9 ld r9,16(r7)
c000000000456048: 07 00 29 89 lbz r9,7(r9) <<<< r9 = NULL
node_match():
mm/slub.c:2491
c00000000045604c: 00 48 99 7f cmpw cr7,r25,r9
c000000000456050: 20 02 9e 41 beq cr7,c000000000456270 <__kmalloc_node+0x330>
The panic occurred in slab_alloc_node() when checking for the page's node:
	object = c->freelist;
	page = c->page;
	if (unlikely(!object || !node_match(page, node))) {
		object = __slab_alloc(s, gfpflags, node, addr, c);
		stat(s, ALLOC_SLOWPATH);
The issue is that object is not NULL while page is NULL. This is odd, but
it can happen if the cache flush occurs after object is loaded but before
page is loaded. The page pointer therefore needs to be checked too.
The cache flush is done through an inter-processor interrupt when a piece
of memory is off-lined. That interrupt is triggered when a memory
hot-unplug operation is initiated: offline_pages() calls slub's
MEM_GOING_OFFLINE callback, slab_mem_going_offline_callback(), which calls
flush_cpu_slab(). If that interrupt arrives between the read of
c->freelist and the read of c->page, object can be non-NULL while page is
NULL. That situation is expected, and the later call to
this_cpu_cmpxchg_double() will detect the change to c->freelist and redo
the whole operation.
Commit 6159d0f5c03e ("mm/slub.c: page is always non-NULL in
node_match()") removed the check on the page pointer, assuming that page
is always valid when node_match() is called. That is not true in this
particular case, so check for page before calling node_match() here.
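The fast-path decision described above can be sketched in user space as
follows. This is only an illustration, not the kernel code: struct
fake_page and needs_slow_path() are hypothetical names, and the per-CPU
state and cmpxchg retry loop are left out. It shows why the NULL check on
page has to come before node_match(), which dereferences page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1)

/* Hypothetical stand-in for struct page: only the node id matters here. */
struct fake_page {
	int nid;
};

/*
 * Same shape as node_match() after commit 6159d0f5c03e: page is assumed
 * to be non-NULL and is dereferenced whenever a specific node is requested.
 */
static bool node_match(const struct fake_page *page, int node)
{
	if (node != NUMA_NO_NODE && page->nid != node)
		return false;
	return true;
}

/*
 * Hypothetical helper modelling the patched condition: take the slow path
 * if there is no object, if page was cleared by a concurrent flush, or if
 * the page sits on the wrong node. The !page test short-circuits before
 * node_match() can dereference a NULL pointer.
 */
static bool needs_slow_path(const void *object,
			    const struct fake_page *page, int node)
{
	return !object || !page || !node_match(page, node);
}
```

With a non-NULL object and a NULL page, needs_slow_path() returns true
without ever calling node_match(), which is the case the panic above hit.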
Link: https://lkml.kernel.org/r/20201027190406.33283-1-ldufour@linux.ibm.com
Fixes: 6159d0f5c03e ("mm/slub.c: page is always non-NULL in node_match()")
Signed-off-by: Laurent Dufour <ldufour(a)linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Wei Yang <richard.weiyang(a)gmail.com>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Nathan Lynch <nathanl(a)linux.ibm.com>
Cc: Scott Cheloha <cheloha(a)linux.ibm.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/slub.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/slub.c~mm-slub-fix-panic-in-slab_alloc_node
+++ a/mm/slub.c
@@ -2852,7 +2852,7 @@ redo:
 
 	object = c->freelist;
 	page = c->page;
-	if (unlikely(!object || !node_match(page, node))) {
+	if (unlikely(!object || !page || !node_match(page, node))) {
 		object = __slab_alloc(s, gfpflags, node, addr, c);
 	} else {
 		void *next_object = get_freepointer_safe(s, object);
_
Patches currently in -mm which might be from ldufour(a)linux.ibm.com are
mm-slub-fix-panic-in-slab_alloc_node.patch
On Wed, Oct 28, 2020 at 10:06:21AM -0700, Guenter Roeck wrote:
> On Tue, Oct 27, 2020 at 02:48:30PM +0100, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.4.241 release.
> > There are 112 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Thu, 29 Oct 2020 13:48:36 +0000.
> > Anything received after that time might be too late.
> >
>
> Build results:
> total: 165 pass: 165 fail: 0
> Qemu test results:
> total: 332 pass: 332 fail: 0
>
Did anyone receive the original e-mail? It looks like I have been tagged as
a spammer, and I am having trouble sending e-mails.
Guenter
The patch titled
Subject: mm: always have io_remap_pfn_range() set pgprot_decrypted()
has been added to the -mm tree. Its filename is
mm-always-have-io_remap_pfn_range-set-pgprot_decrypted.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/mm-always-have-io_remap_pfn_range…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/mm-always-have-io_remap_pfn_range…
------------------------------------------------------
From: Jason Gunthorpe <jgg(a)nvidia.com>
Subject: mm: always have io_remap_pfn_range() set pgprot_decrypted()
The purpose of io_remap_pfn_range() is to map IO memory, such as
memory-mapped IO exposed through a PCI BAR. IO devices do not understand
encryption, so this memory must always be decrypted. Automatically call
pgprot_decrypted() as part of the generic implementation.
This fixes a bug where enabling AMD SME causes subsystems such as RDMA,
which use io_remap_pfn_range() to expose BAR pages to user space, to fail.
The CPU encrypts accesses to those BAR pages instead of passing
unencrypted IO directly to the device.
Places not mapping IO should use remap_pfn_range().
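The difference between the two entry points can be sketched in user space
as follows. Everything prefixed with fake_ or FAKE_ is hypothetical: the
encryption bit position is purely illustrative, and the mock only records
the protection bits it was handed instead of touching page tables.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical protection type; the real pgprot_t is architecture specific. */
typedef struct {
	uint64_t bits;
} fake_pgprot_t;

/* Illustrative encryption bit only; not taken from any real architecture. */
#define FAKE_ENC_BIT (UINT64_C(1) << 47)

/* Records the protection bits seen by the last mapping call. */
static uint64_t last_mapped_prot;

/* Models pgprot_decrypted(): clear the encryption bit, keep the rest. */
static fake_pgprot_t fake_pgprot_decrypted(fake_pgprot_t prot)
{
	prot.bits &= ~FAKE_ENC_BIT;
	return prot;
}

/* Models remap_pfn_range(): uses the protection bits exactly as given. */
static int fake_remap_pfn_range(unsigned long addr, unsigned long pfn,
				unsigned long size, fake_pgprot_t prot)
{
	(void)addr;
	(void)pfn;
	(void)size;
	last_mapped_prot = prot.bits;
	return 0;
}

/* Models the generic io_remap_pfn_range(): always maps decrypted. */
static int fake_io_remap_pfn_range(unsigned long addr, unsigned long pfn,
				   unsigned long size, fake_pgprot_t prot)
{
	return fake_remap_pfn_range(addr, pfn, size,
				    fake_pgprot_decrypted(prot));
}
```

A caller that passes protection bits with the encryption bit set gets a
decrypted mapping from fake_io_remap_pfn_range(), while
fake_remap_pfn_range() leaves the bit untouched.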
Link: https://lkml.kernel.org/r/0-v1-025d64bdf6c4+e-amd_sme_fix_jgg@nvidia.com
Fixes: aca20d546214 ("x86/mm: Add support to make use of Secure Memory Encryption")
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brijesh Singh <brijesh.singh(a)amd.com>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: "Dave Young" <dyoung(a)redhat.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Larry Woodman <lwoodman(a)redhat.com>
Cc: Matt Fleming <matt(a)codeblueprint.co.uk>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: "Michael S. Tsirkin" <mst(a)redhat.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Toshimitsu Kani <toshi.kani(a)hpe.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mm.h | 9 +++++++++
include/linux/pgtable.h | 4 ----
2 files changed, 9 insertions(+), 4 deletions(-)
--- a/include/linux/mm.h~mm-always-have-io_remap_pfn_range-set-pgprot_decrypted
+++ a/include/linux/mm.h
@@ -2759,6 +2759,15 @@ static inline vm_fault_t vmf_insert_page
 	return VM_FAULT_NOPAGE;
 }
 
+#ifndef io_remap_pfn_range
+static inline int io_remap_pfn_range(struct vm_area_struct *vma,
+				     unsigned long addr, unsigned long pfn,
+				     unsigned long size, pgprot_t prot)
+{
+	return remap_pfn_range(vma, addr, pfn, size, pgprot_decrypted(prot));
+}
+#endif
+
 static inline vm_fault_t vmf_error(int err)
 {
 	if (err == -ENOMEM)
--- a/include/linux/pgtable.h~mm-always-have-io_remap_pfn_range-set-pgprot_decrypted
+++ a/include/linux/pgtable.h
@@ -1427,10 +1427,6 @@ typedef unsigned int pgtbl_mod_mask;
#endif /* !__ASSEMBLY__ */
-#ifndef io_remap_pfn_range
-#define io_remap_pfn_range remap_pfn_range
-#endif
-
#ifndef has_transparent_hugepage
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define has_transparent_hugepage() 1
_
Patches currently in -mm which might be from jgg(a)nvidia.com are
mm-always-have-io_remap_pfn_range-set-pgprot_decrypted.patch
Retry
On Wed, Oct 28, 2020 at 10:12:08AM -0700, Guenter Roeck wrote:
> On Tue, Oct 27, 2020 at 02:44:10PM +0100, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.9.2 release.
> > There are 757 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Thu, 29 Oct 2020 13:52:54 +0000.
> > Anything received after that time might be too late.
> >
>
> Build results:
> total: 154 pass: 154 fail: 0
> Qemu test results:
> total: 426 pass: 426 fail: 0
>
> Tested-by: Guenter Roeck <linux(a)roeck-us.net>
>
> Guenter
Retry
On Wed, Oct 28, 2020 at 10:11:38AM -0700, Guenter Roeck wrote:
> On Tue, Oct 27, 2020 at 02:45:43PM +0100, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.8.17 release.
> > There are 633 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Thu, 29 Oct 2020 13:53:43 +0000.
> > Anything received after that time might be too late.
> >
>
> Build results:
> total: 154 pass: 154 fail: 0
> Qemu test results:
> total: 426 pass: 426 fail: 0
>
> Tested-by: Guenter Roeck <linux(a)roeck-us.net>
>
> Guenter
Retry.
On Wed, Oct 28, 2020 at 10:11:08AM -0700, Guenter Roeck wrote:
> On Tue, Oct 27, 2020 at 02:48:58PM +0100, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 5.4.73 release.
> > There are 408 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Thu, 29 Oct 2020 13:53:50 +0000.
> > Anything received after that time might be too late.
> >
>
> Build results:
> total: 157 pass: 157 fail: 0
> Qemu test results:
> total: 426 pass: 426 fail: 0
>
> Tested-by: Guenter Roeck <linux(a)roeck-us.net>
>
> Guenter
Retry.
On Wed, Oct 28, 2020 at 10:06:53AM -0700, Guenter Roeck wrote:
> On Tue, Oct 27, 2020 at 02:48:14PM +0100, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.9.241 release.
> > There are 139 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Thu, 29 Oct 2020 13:48:36 +0000.
> > Anything received after that time might be too late.
> >
>
> Build results:
> total: 168 pass: 168 fail: 0
> Qemu test results:
> total: 386 pass: 386 fail: 0
>
> Tested-by: Guenter Roeck <linux(a)roeck-us.net>
>
> Guenter