Commit 944e0fc51a89c9827b98813d65dc083274777c7f ("x86/amd: don't set
X86_BUG_SYSRET_SS_ATTRS when running under Xen") breaks Xen pv-domains
on AMD processors, as a prerequisite patch from upstream wasn't added
to 4.9.
Fix that by adding the prerequisite setting of X86_FEATURE_XENPV to the
Xen pv early boot path.
Cc: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Signed-off-by: Juergen Gross <jgross(a)suse.com>
---
arch/x86/xen/enlighten.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
index 081437b5f381..674656cdb68c 100644
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1632,6 +1632,9 @@ asmlinkage __visible void __init xen_start_kernel(void)
xen_init_irq_ops();
xen_init_cpuid_mask();
+ /* Needed for init_amd(). */
+ setup_force_cpu_cap(X86_FEATURE_XENPV);
+
#ifdef CONFIG_X86_LOCAL_APIC
/*
* set up the basic apic ops.
--
2.13.6
The changes are to make sure to check the operation status.
Actually the flash write and erase error behavior is caused on our products.
The flash is Macronix flash device MX29GL512FHT2I-11G used by our products.
The patch series was separated for changes of flash write and erase.
Since those were not depended each other at the time.
But by additional changes the changes are related more as same way currently.
So combine patch series for the flash write and erase changes as v6.
Signed-off-by: Tokunori Ikegami <ikegami(a)allied-telesis.co.jp>
Reviewed-by: Joakim Tjernlund <Joakim.Tjernlund(a)infinera.com>
Cc: Chris Packham <chris.packham(a)alliedtelesis.co.nz>
Cc: Brian Norris <computersforpeace(a)gmail.com>
Cc: David Woodhouse <dwmw2(a)infradead.org>
Cc: Boris Brezillon <boris.brezillon(a)free-electrons.com>
Cc: Marek Vasut <marek.vasut(a)gmail.com>
Cc: Richard Weinberger <richard(a)nod.at>
Cc: Cyrille Pitchen <cyrille.pitchen(a)wedev4u.fr>
Cc: linux-mtd(a)lists.infradead.org
Cc: stable(a)vger.kernel.org
Tokunori Ikegami (5):
mtd: cfi_cmdset_0002: Change write buffer to check correct value
mtd: cfi_cmdset_0002: Change definition naming to retry write
operation
mtd: cfi_cmdset_0002: Change erase functions to retry for error
mtd: cfi_cmdset_0002: Change erase functions to check chip good only
mtd: cfi_cmdset_0002: Change erase one block to enable XIP once
drivers/mtd/chips/cfi_cmdset_0002.c | 36 +++++++++++++++++++++++-------------
1 file changed, 23 insertions(+), 13 deletions(-)
--
2.16.1
Clear the PCR (Processor Compatibility Register) on boot to ensure we
are not running in a compatibility mode.
We've seen this cause problems when a crash (and kdump) occurs while
running compat mode guests. The kdump kernel then runs with the PCR
set and causes problems. The symptom in the kdump kernel (also seen in
petitboot after fast-reboot) is early userspace programs taking
sigills on newer instructions (seen in libc).
Signed-off-by: Michael Neuling <mikey(a)neuling.org>
Cc: stable(a)vger.kernel.org # v4.9
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
--
Greg, This is a backport for v4.9 only since the original patch didn't
apply.
Commit faf37c44a105f3608115785f17cbbf3500f8bc71 upstream.
---
arch/powerpc/kernel/cpu_setup_power.S | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/powerpc/kernel/cpu_setup_power.S b/arch/powerpc/kernel/cpu_setup_power.S
index 9e05c8828e..ff45d007d1 100644
--- a/arch/powerpc/kernel/cpu_setup_power.S
+++ b/arch/powerpc/kernel/cpu_setup_power.S
@@ -28,6 +28,7 @@ _GLOBAL(__setup_cpu_power7)
beqlr
li r0,0
mtspr SPRN_LPID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
bl __init_LPCR
bl __init_tlb_power7
@@ -41,6 +42,7 @@ _GLOBAL(__restore_cpu_power7)
beqlr
li r0,0
mtspr SPRN_LPID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
bl __init_LPCR
bl __init_tlb_power7
@@ -57,6 +59,7 @@ _GLOBAL(__setup_cpu_power8)
beqlr
li r0,0
mtspr SPRN_LPID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
ori r3, r3, LPCR_PECEDH
bl __init_LPCR
@@ -78,6 +81,7 @@ _GLOBAL(__restore_cpu_power8)
beqlr
li r0,0
mtspr SPRN_LPID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
ori r3, r3, LPCR_PECEDH
bl __init_LPCR
@@ -98,6 +102,7 @@ _GLOBAL(__setup_cpu_power9)
li r0,0
mtspr SPRN_LPID,r0
mtspr SPRN_PID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
LOAD_REG_IMMEDIATE(r4, LPCR_PECEDH | LPCR_PECE_HVEE | LPCR_HVICE)
or r3, r3, r4
@@ -121,6 +126,7 @@ _GLOBAL(__restore_cpu_power9)
li r0,0
mtspr SPRN_LPID,r0
mtspr SPRN_PID,r0
+ mtspr SPRN_PCR,r0
mfspr r3,SPRN_LPCR
LOAD_REG_IMMEDIATE(r4, LPCR_PECEDH | LPCR_PECE_HVEE | LPCR_HVICE)
or r3, r3, r4
--
2.17.0
Hi Greg,
please queue:
1. 8a331f4a0863 ("x86/mce/AMD: Carve out SMCA get_block_address() code")
2. 78ce241099bb ("x86/MCE/AMD: Cache SMCA MISC block addresses")
in that order to
4.14
4.16
as it fixes an uninit mem read bug: https://bugzilla.kernel.org/show_bug.cgi?id=199851
Patches apply cleanly and build fine.
Thx.
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
'Commit cc27b735ad3a ("PCI/portdrv: Turn off PCIe services during
shutdown")' has been added to kernel to shutdown pending PCIe port
service interrupts during reboot so that a newly started kexec kernel
wouldn't observe pending interrupts.
pcie_port_device_remove() is disabling the root port and switches by
calling pci_disable_device() after all PCIe service drivers are shutdown.
pci_disable_device() has a much wider impact then port service itself and
it prevents all inbound transactions to reach to the system and impacts
the entire PCI traffic behind the bridge.
Issue is that pcie_port_device_remove() doesn't maintain any coordination
with the rest of the PCI device drivers in the system before clearing the
bus master bit.
This has been found to cause crashes on HP DL360 Gen9 machines during
reboot. Besides, kexec is already clearing the bus master bit in
pci_device_shutdown() after all PCI drivers are removed.
Just remove the extra clear in shutdown path by seperating the remove and
shutdown APIs in the PORTDRV.
Signed-off-by: Sinan Kaya <okaya(a)codeaurora.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199779
Fixes: cc27b735ad3a ("PCI/portdrv: Turn off PCIe services during shutdown")
Cc: stable(a)vger.kernel.org
Reported-by: Ryan Finnie <ryan(a)finnie.org>
---
drivers/pci/pcie/portdrv.h | 2 +-
drivers/pci/pcie/portdrv_core.c | 5 +++--
drivers/pci/pcie/portdrv_pci.c | 16 +++++++++++++---
3 files changed, 17 insertions(+), 6 deletions(-)
diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h
index d0c6783..f6e88fe 100644
--- a/drivers/pci/pcie/portdrv.h
+++ b/drivers/pci/pcie/portdrv.h
@@ -86,7 +86,7 @@ int pcie_port_device_register(struct pci_dev *dev);
int pcie_port_device_suspend(struct device *dev);
int pcie_port_device_resume(struct device *dev);
#endif
-void pcie_port_device_remove(struct pci_dev *dev);
+void pcie_port_device_remove(struct pci_dev *dev, bool disable);
int __must_check pcie_port_bus_register(void);
void pcie_port_bus_unregister(void);
diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c
index c9c0663..f35341e 100644
--- a/drivers/pci/pcie/portdrv_core.c
+++ b/drivers/pci/pcie/portdrv_core.c
@@ -405,11 +405,12 @@ static int remove_iter(struct device *dev, void *data)
* Remove PCI Express port service devices associated with given port and
* disable MSI-X or MSI for the port.
*/
-void pcie_port_device_remove(struct pci_dev *dev)
+void pcie_port_device_remove(struct pci_dev *dev, bool disable)
{
device_for_each_child(&dev->dev, NULL, remove_iter);
pci_free_irq_vectors(dev);
- pci_disable_device(dev);
+ if (disable)
+ pci_disable_device(dev);
}
/**
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 973f1b8..29afaf3 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -137,7 +137,7 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
return 0;
}
-static void pcie_portdrv_remove(struct pci_dev *dev)
+static void __pcie_portdrv_remove(struct pci_dev *dev, bool disable)
{
if (pci_bridge_d3_possible(dev)) {
pm_runtime_forbid(&dev->dev);
@@ -145,7 +145,17 @@ static void pcie_portdrv_remove(struct pci_dev *dev)
pm_runtime_dont_use_autosuspend(&dev->dev);
}
- pcie_port_device_remove(dev);
+ pcie_port_device_remove(dev, disable);
+}
+
+static void pcie_portdrv_remove(struct pci_dev *dev)
+{
+ return __pcie_portdrv_remove(dev, true);
+}
+
+static void pcie_portdrv_shutdown(struct pci_dev *dev)
+{
+ return __pcie_portdrv_remove(dev, false);
}
static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev,
@@ -218,7 +228,7 @@ static struct pci_driver pcie_portdriver = {
.probe = pcie_portdrv_probe,
.remove = pcie_portdrv_remove,
- .shutdown = pcie_portdrv_remove,
+ .shutdown = pcie_portdrv_shutdown,
.err_handler = &pcie_portdrv_err_handler,
--
2.7.4
It is up to a driver to implement shutdown() callback. If shutdown()
callback is not implemented, PCI device can have pending interrupt and
even do DMA transactions while the system is going down.
If kexec is in use, this can damage the newly booting kexec kernel
or even prevent it from booting altogether. Fallback to calling the
remove() callback if shutdown() isn't implemented for a given driver.
Signed-off-by: Sinan Kaya <okaya(a)codeaurora.org>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=199779
Fixes: cc27b735ad3a ("PCI/portdrv: Turn off PCIe services during shutdown")
Cc: stable(a)vger.kernel.org
Reported-by: Ryan Finnie <ryan(a)finnie.org>
---
drivers/pci/pci-driver.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index cbda0e6..75a00fe 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -477,8 +477,17 @@ static void pci_device_shutdown(struct device *dev)
pm_runtime_resume(dev);
+ /*
+ * Try shutdown callback if it exists, otherwise fallback to remove
+ * callback. PCI drivers can do DMA and have pending interrupts.
+ * Leaving the DMA and interrupts pending could damage the newly
+ * booting kexec kernel as well as prevent it from booting altogether
+ * if the pending interrupt is level.
+ */
if (drv && drv->shutdown)
drv->shutdown(pci_dev);
+ else if (drv && drv->remove)
+ drv->remove(pci_dev);
/*
* If this is a kexec reboot, turn off Bus Master bit on the
--
2.7.4
The patch titled
Subject: ocfs2: fix a misuse a of brelse after failing ocfs2_check_dir_entry
has been added to the -mm tree. Its filename is
ocfs2-fix-a-misuse-a-of-brelse-after-failing-ocfs2_check_dir_entry.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/ocfs2-fix-a-misuse-a-of-brelse-aft…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/ocfs2-fix-a-misuse-a-of-brelse-aft…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Changwei Ge <ge.changwei(a)h3c.com>
Subject: ocfs2: fix a misuse a of brelse after failing ocfs2_check_dir_entry
Somehow, file system metadata was corrupted, which causes
ocfs2_check_dir_entry() to fail in function ocfs2_dir_foreach_blk_el().
According to the original design intention, if above happens we should
skip the problematic block and continue to retrieve dir entry. But there
is obviouse misuse of brelse around related code.
After failure of ocfs2_check_dir_entry(), currunt code just moves to next
position and uses the problematic buffer head again and again during which
the problematic buffer head is released for multiple times. I suppose,
this a serious issue which is long-lived in ocfs2. This may cause other
file systems which is also used in a the same host insane.
So we should also consider about bakcporting this patch into
linux -stable.
Link: http://lkml.kernel.org/r/HK2PR06MB045211675B43EED794E597B6D56E0@HK2PR06MB04…
Signed-off-by: Changwei Ge <ge.changwei(a)h3c.com>
Suggested-by: Changkuo Shi <shi.changkuo(a)h3c.com>
Cc: Mark Fasheh <mark(a)fasheh.com>
Cc: Joel Becker <jlbec(a)evilplan.org>
Cc: Junxiao Bi <junxiao.bi(a)oracle.com>
Cc: Joseph Qi <jiangqi903(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/ocfs2/dir.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff -puN fs/ocfs2/dir.c~ocfs2-fix-a-misuse-a-of-brelse-after-failing-ocfs2_check_dir_entry fs/ocfs2/dir.c
--- a/fs/ocfs2/dir.c~ocfs2-fix-a-misuse-a-of-brelse-after-failing-ocfs2_check_dir_entry
+++ a/fs/ocfs2/dir.c
@@ -1897,8 +1897,7 @@ static int ocfs2_dir_foreach_blk_el(stru
/* On error, skip the f_pos to the
next block. */
ctx->pos = (ctx->pos | (sb->s_blocksize - 1)) + 1;
- brelse(bh);
- continue;
+ break;
}
if (le64_to_cpu(de->inode)) {
unsigned char d_type = DT_UNKNOWN;
_
Patches currently in -mm which might be from ge.changwei(a)h3c.com are
ocfs2-fix-a-misuse-a-of-brelse-after-failing-ocfs2_check_dir_entry.patch
ocfs2-dont-put-and-assign-null-to-bh-allocated-outside.patch
ocfs2-dont-use-iocb-when-eiocbqueued-returns.patch
The patch titled
Subject: kasan: fix memory hotplug during boot
has been removed from the -mm tree. Its filename was
kasan-fix-memory-hotplug-during-boot.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: kasan: fix memory hotplug during boot
Using module_init() is wrong. E.g. ACPI adds and onlines memory before
our memory notifier gets registered.
This makes sure that ACPI memory detected during boot up will not result
in a kernel crash.
Easily reproducible with QEMU, just specify a DIMM when starting up.
Link: http://lkml.kernel.org/r/20180522100756.18478-3-david@redhat.com
Fixes: 786a8959912e ("kasan: disable memory hotplug")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kasan/kasan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -puN mm/kasan/kasan.c~kasan-fix-memory-hotplug-during-boot mm/kasan/kasan.c
--- a/mm/kasan/kasan.c~kasan-fix-memory-hotplug-during-boot
+++ a/mm/kasan/kasan.c
@@ -898,5 +898,5 @@ static int __init kasan_memhotplug_init(
return 0;
}
-module_init(kasan_memhotplug_init);
+core_initcall(kasan_memhotplug_init);
#endif
_
Patches currently in -mm which might be from david(a)redhat.com are
The patch titled
Subject: kasan: free allocated shadow memory on MEM_CANCEL_ONLINE
has been removed from the -mm tree. Its filename was
kasan-free-allocated-shadow-memory-on-mem_cancel_online.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: David Hildenbrand <david(a)redhat.com>
Subject: kasan: free allocated shadow memory on MEM_CANCEL_ONLINE
We have to free memory again when we cancel onlining, otherwise a later
onlining attempt will fail.
Link: http://lkml.kernel.org/r/20180522100756.18478-2-david@redhat.com
Fixes: fa69b5989bb0 ("mm/kasan: add support for memory hotplug")
Signed-off-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kasan/kasan.c | 1 +
1 file changed, 1 insertion(+)
diff -puN mm/kasan/kasan.c~kasan-free-allocated-shadow-memory-on-mem_cancel_online mm/kasan/kasan.c
--- a/mm/kasan/kasan.c~kasan-free-allocated-shadow-memory-on-mem_cancel_online
+++ a/mm/kasan/kasan.c
@@ -866,6 +866,7 @@ static int __meminit kasan_mem_notifier(
kmemleak_ignore(ret);
return NOTIFY_OK;
}
+ case MEM_CANCEL_ONLINE:
case MEM_OFFLINE: {
struct vm_struct *vm;
_
Patches currently in -mm which might be from david(a)redhat.com are
The patch titled
Subject: kernel/sys.c: fix potential Spectre v1 issue
has been removed from the -mm tree. Its filename was
kernel-sys-fix-potential-spectre-v1.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: "Gustavo A. R. Silva" <gustavo(a)embeddedor.com>
Subject: kernel/sys.c: fix potential Spectre v1 issue
`resource' can be controlled by user-space, hence leading to a potential
exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
kernel/sys.c:1474 __do_compat_sys_old_getrlimit() warn: potential
spectre issue 'get_current()->signal->rlim' (local cap)
kernel/sys.c:1455 __do_sys_old_getrlimit() warn: potential spectre issue
'get_current()->signal->rlim' (local cap)
Fix this by sanitizing *resource* before using it to index
current->signal->rlim
Notice that given that speculation windows are large, the policy is to
kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
Link: http://lkml.kernel.org/r/20180515030038.GA11822@embeddedor.com
Signed-off-by: Gustavo A. R. Silva <gustavo(a)embeddedor.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Alexei Starovoitov <ast(a)kernel.org>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/sys.c | 5 +++++
1 file changed, 5 insertions(+)
diff -puN kernel/sys.c~kernel-sys-fix-potential-spectre-v1 kernel/sys.c
--- a/kernel/sys.c~kernel-sys-fix-potential-spectre-v1
+++ a/kernel/sys.c
@@ -71,6 +71,9 @@
#include <asm/io.h>
#include <asm/unistd.h>
+/* Hardening for Spectre-v1 */
+#include <linux/nospec.h>
+
#include "uid16.h"
#ifndef SET_UNALIGN_CTL
@@ -1453,6 +1456,7 @@ SYSCALL_DEFINE2(old_getrlimit, unsigned
if (resource >= RLIM_NLIMITS)
return -EINVAL;
+ resource = array_index_nospec(resource, RLIM_NLIMITS);
task_lock(current->group_leader);
x = current->signal->rlim[resource];
task_unlock(current->group_leader);
@@ -1472,6 +1476,7 @@ COMPAT_SYSCALL_DEFINE2(old_getrlimit, un
if (resource >= RLIM_NLIMITS)
return -EINVAL;
+ resource = array_index_nospec(resource, RLIM_NLIMITS);
task_lock(current->group_leader);
r = current->signal->rlim[resource];
task_unlock(current->group_leader);
_
Patches currently in -mm which might be from gustavo(a)embeddedor.com are
The patch titled
Subject: mm/kasan: don't vfree() nonexistent vm_area
has been removed from the -mm tree. Its filename was
mm-kasan-dont-vfree-nonexistent-vm_area.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Subject: mm/kasan: don't vfree() nonexistent vm_area
KASAN uses different routines to map shadow for hot added memory and
memory obtained in boot process. Attempt to offline memory onlined by
normal boot process leads to this:
Trying to vfree() nonexistent vm area (000000005d3b34b9)
WARNING: CPU: 2 PID: 13215 at mm/vmalloc.c:1525 __vunmap+0x147/0x190
Call Trace:
kasan_mem_notifier+0xad/0xb9
notifier_call_chain+0x166/0x260
__blocking_notifier_call_chain+0xdb/0x140
__offline_pages+0x96a/0xb10
memory_subsys_offline+0x76/0xc0
device_offline+0xb8/0x120
store_mem_state+0xfa/0x120
kernfs_fop_write+0x1d5/0x320
__vfs_write+0xd4/0x530
vfs_write+0x105/0x340
SyS_write+0xb0/0x140
Obviously we can't call vfree() to free memory that wasn't allocated via
vmalloc(). Use find_vm_area() to see if we can call vfree().
Unfortunately it's a bit tricky to properly unmap and free shadow
allocated during boot, so we'll have to keep it. If memory will come
online again that shadow will be reused.
Matthew asked: how can you call vfree() on something that isn't a
vmalloc address?
vfree() is able to free any address returned by
__vmalloc_node_range(). And __vmalloc_node_range() gives you any
address you ask. It doesn't have to be an address in [VMALLOC_START,
VMALLOC_END] range.
That's also how the module_alloc()/module_memfree() works on
architectures that have designated area for modules.
[aryabinin(a)virtuozzo.com: improve comments]
Link: http://lkml.kernel.org/r/dabee6ab-3a7a-51cd-3b86-5468718e0390@virtuozzo.com
[akpm(a)linux-foundation.org: fix typos, reflow comment]
Link: http://lkml.kernel.org/r/20180201163349.8700-1-aryabinin@virtuozzo.com
Fixes: fa69b5989bb0 ("mm/kasan: add support for memory hotplug")
Signed-off-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Reported-by: Paul Menzel <pmenzel+linux-kasan-dev(a)molgen.mpg.de>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/kasan/kasan.c | 63 +++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 61 insertions(+), 2 deletions(-)
diff -puN mm/kasan/kasan.c~mm-kasan-dont-vfree-nonexistent-vm_area mm/kasan/kasan.c
--- a/mm/kasan/kasan.c~mm-kasan-dont-vfree-nonexistent-vm_area
+++ a/mm/kasan/kasan.c
@@ -792,6 +792,40 @@ DEFINE_ASAN_SET_SHADOW(f5);
DEFINE_ASAN_SET_SHADOW(f8);
#ifdef CONFIG_MEMORY_HOTPLUG
+static bool shadow_mapped(unsigned long addr)
+{
+ pgd_t *pgd = pgd_offset_k(addr);
+ p4d_t *p4d;
+ pud_t *pud;
+ pmd_t *pmd;
+ pte_t *pte;
+
+ if (pgd_none(*pgd))
+ return false;
+ p4d = p4d_offset(pgd, addr);
+ if (p4d_none(*p4d))
+ return false;
+ pud = pud_offset(p4d, addr);
+ if (pud_none(*pud))
+ return false;
+
+ /*
+ * We can't use pud_large() or pud_huge(), the first one is
+ * arch-specific, the last one depends on HUGETLB_PAGE. So let's abuse
+ * pud_bad(), if pud is bad then it's bad because it's huge.
+ */
+ if (pud_bad(*pud))
+ return true;
+ pmd = pmd_offset(pud, addr);
+ if (pmd_none(*pmd))
+ return false;
+
+ if (pmd_bad(*pmd))
+ return true;
+ pte = pte_offset_kernel(pmd, addr);
+ return !pte_none(*pte);
+}
+
static int __meminit kasan_mem_notifier(struct notifier_block *nb,
unsigned long action, void *data)
{
@@ -813,6 +847,14 @@ static int __meminit kasan_mem_notifier(
case MEM_GOING_ONLINE: {
void *ret;
+ /*
+ * If shadow is mapped already than it must have been mapped
+ * during the boot. This could happen if we onlining previously
+ * offlined memory.
+ */
+ if (shadow_mapped(shadow_start))
+ return NOTIFY_OK;
+
ret = __vmalloc_node_range(shadow_size, PAGE_SIZE, shadow_start,
shadow_end, GFP_KERNEL,
PAGE_KERNEL, VM_NO_GUARD,
@@ -824,8 +866,25 @@ static int __meminit kasan_mem_notifier(
kmemleak_ignore(ret);
return NOTIFY_OK;
}
- case MEM_OFFLINE:
- vfree((void *)shadow_start);
+ case MEM_OFFLINE: {
+ struct vm_struct *vm;
+
+ /*
+ * shadow_start was either mapped during boot by kasan_init()
+ * or during memory online by __vmalloc_node_range().
+ * In the latter case we can use vfree() to free shadow.
+ * Non-NULL result of the find_vm_area() will tell us if
+ * that was the second case.
+ *
+ * Currently it's not possible to free shadow mapped
+ * during boot by kasan_init(). It's because the code
+ * to do that hasn't been written yet. So we'll just
+ * leak the memory.
+ */
+ vm = find_vm_area((void *)shadow_start);
+ if (vm)
+ vfree((void *)shadow_start);
+ }
}
return NOTIFY_OK;
_
Patches currently in -mm which might be from aryabinin(a)virtuozzo.com are
The patch titled
Subject: ipc/shm: fix shmat() nil address after round-down when remapping
has been removed from the -mm tree. Its filename was
ipc-shm-fix-shmat-nil-address-after-round-down-when-remapping.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Davidlohr Bueso <dave(a)stgolabs.net>
Subject: ipc/shm: fix shmat() nil address after round-down when remapping
shmat()'s SHM_REMAP option forbids passing a nil address for; this is in
fact the very first thing we check for. Andrea reported that for
SHM_RND|SHM_REMAP cases we can end up bypassing the initial addr check,
but we need to check again if the address was rounded down to nil. As of
this patch, such cases will return -EINVAL.
Link: http://lkml.kernel.org/r/20180503204934.kk63josdu6u53fbd@linux-n805
Signed-off-by: Davidlohr Bueso <dbueso(a)suse.de>
Reported-by: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Joe Lawrence <joe.lawrence(a)redhat.com>
Cc: Manfred Spraul <manfred(a)colorfullife.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
ipc/shm.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff -puN ipc/shm.c~ipc-shm-fix-shmat-nil-address-after-round-down-when-remapping ipc/shm.c
--- a/ipc/shm.c~ipc-shm-fix-shmat-nil-address-after-round-down-when-remapping
+++ a/ipc/shm.c
@@ -1363,9 +1363,17 @@ long do_shmat(int shmid, char __user *sh
if (addr) {
if (addr & (shmlba - 1)) {
- if (shmflg & SHM_RND)
+ if (shmflg & SHM_RND) {
addr &= ~(shmlba - 1); /* round down */
- else
+
+ /*
+ * Ensure that the round-down is non-nil
+ * when remapping. This can happen for
+ * cases when addr < shmlba.
+ */
+ if (!addr && (shmflg & SHM_REMAP))
+ goto out;
+ } else
#ifndef __ARCH_FORCE_SHMLBA
if (addr & ~PAGE_MASK)
#endif
_
Patches currently in -mm which might be from dave(a)stgolabs.net are
ipc-sem-mitigate-semnum-index-against-spectre-v1.patch
The patch titled
Subject: Revert "ipc/shm: Fix shmat mmap nil-page protection"
has been removed from the -mm tree. Its filename was
revert-ipc-shm-fix-shmat-mmap-nil-page-protection.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Davidlohr Bueso <dave(a)stgolabs.net>
Subject: Revert "ipc/shm: Fix shmat mmap nil-page protection"
Patch series "ipc/shm: shmat() fixes around nil-page".
These patches fix two issues reported[1] a while back by Joe and Andrea
around how shmat(2) behaves with nil-page.
The first reverts a commit that it was incorrectly thought that mapping
nil-page (address=0) was a no no with MAP_FIXED. This is not the case,
with the exception of SHM_REMAP; which is address in the second patch.
I chose two patches because it is easier to backport and it explicitly
reverts bogus behaviour. Both patches ought to be in -stable and ltp
testcases need updated (the added testcase around the cve can be modified
to just test for SHM_RND|SHM_REMAP).
[1] lkml.kernel.org/r/20180430172152.nfa564pvgpk3ut7p@linux-n805
This patch (of 2):
95e91b831f87 ("ipc/shm: Fix shmat mmap nil-page protection") worked on the
idea that we should not be mapping as root addr=0 and MAP_FIXED. However,
it was reported that this scenario is in fact valid, thus making the patch
both bogus and breaks userspace as well. For example X11's libint10.so
relies on shmat(1, SHM_RND) for lowmem initialization[1].
[1] https://cgit.freedesktop.org/xorg/xserver/tree/hw/xfree86/os-support/linux/…
Link: http://lkml.kernel.org/r/20180503203243.15045-2-dave@stgolabs.net
Fixes: 95e91b831f87 ("ipc/shm: Fix shmat mmap nil-page protection")
Signed-off-by: Davidlohr Bueso <dbueso(a)suse.de>
Reported-by: Joe Lawrence <joe.lawrence(a)redhat.com>
Reported-by: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: Manfred Spraul <manfred(a)colorfullife.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
ipc/shm.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
diff -puN ipc/shm.c~revert-ipc-shm-fix-shmat-mmap-nil-page-protection ipc/shm.c
--- a/ipc/shm.c~revert-ipc-shm-fix-shmat-mmap-nil-page-protection
+++ a/ipc/shm.c
@@ -1363,13 +1363,8 @@ long do_shmat(int shmid, char __user *sh
if (addr) {
if (addr & (shmlba - 1)) {
- /*
- * Round down to the nearest multiple of shmlba.
- * For sane do_mmap_pgoff() parameters, avoid
- * round downs that trigger nil-page and MAP_FIXED.
- */
- if ((shmflg & SHM_RND) && addr >= shmlba)
- addr &= ~(shmlba - 1);
+ if (shmflg & SHM_RND)
+ addr &= ~(shmlba - 1); /* round down */
else
#ifndef __ARCH_FORCE_SHMLBA
if (addr & ~PAGE_MASK)
_
Patches currently in -mm which might be from dave(a)stgolabs.net are
ipc-sem-mitigate-semnum-index-against-spectre-v1.patch
The patch titled
Subject: idr: fix invalid ptr dereference on item delete
has been removed from the -mm tree. Its filename was
idr-fix-invalid-ptr-dereference-on-item-delete.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Matthew Wilcox <mawilcox(a)microsoft.com>
Subject: idr: fix invalid ptr dereference on item delete
If the radix tree underlying the IDR happens to be full and we attempt to
remove an id which is larger than any id in the IDR, we will call
__radix_tree_delete() with an uninitialised 'slot' pointer, at which point
anything could happen. This was easiest to hit with a single entry at id
0 and attempting to remove a non-0 id, but it could have happened with 64
entries and attempting to remove an id >= 64.
Roman said:
The syzcaller test boils down to opening /dev/kvm, creating an
eventfd, and calling a couple of KVM ioctls. None of this requires
superuser. And the result is dereferencing an uninitialized pointer
which is likely a crash. The specific path caught by syzbot is via
KVM_HYPERV_EVENTD ioctl which is new in 4.17. But I guess there are
other user-triggerable paths, so cc:stable is probably justified.
Matthew added:
We have around 250 calls to idr_remove() in the kernel today. Many
of them pass an ID which is embedded in the object they're removing,
so they're safe. Picking a few likely candidates:
drivers/firewire/core-cdev.c looks unsafe; the ID comes from an ioctl.
drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c is similar
drivers/atm/nicstar.c could be taken down by a handcrafted packet
Link: http://lkml.kernel.org/r/20180518175025.GD6361@bombadil.infradead.org
Fixes: 0a835c4f090a ("Reimplement IDR and IDA using the radix tree")
Reported-by: <syzbot+35666cba7f0a337e2e79(a)syzkaller.appspotmail.com>
Debugged-by: Roman Kagan <rkagan(a)virtuozzo.com>
Signed-off-by: Matthew Wilcox <mawilcox(a)microsoft.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/radix-tree.c | 4 +++-
tools/testing/radix-tree/idr-test.c | 7 +++++++
2 files changed, 10 insertions(+), 1 deletion(-)
diff -puN lib/radix-tree.c~idr-fix-invalid-ptr-dereference-on-item-delete lib/radix-tree.c
--- a/lib/radix-tree.c~idr-fix-invalid-ptr-dereference-on-item-delete
+++ a/lib/radix-tree.c
@@ -2034,10 +2034,12 @@ void *radix_tree_delete_item(struct radi
unsigned long index, void *item)
{
struct radix_tree_node *node = NULL;
- void __rcu **slot;
+ void __rcu **slot = NULL;
void *entry;
entry = __radix_tree_lookup(root, index, &node, &slot);
+ if (!slot)
+ return NULL;
if (!entry && (!is_idr(root) || node_tag_get(root, node, IDR_FREE,
get_slot_offset(node, slot))))
return NULL;
diff -puN tools/testing/radix-tree/idr-test.c~idr-fix-invalid-ptr-dereference-on-item-delete tools/testing/radix-tree/idr-test.c
--- a/tools/testing/radix-tree/idr-test.c~idr-fix-invalid-ptr-dereference-on-item-delete
+++ a/tools/testing/radix-tree/idr-test.c
@@ -252,6 +252,13 @@ void idr_checks(void)
idr_remove(&idr, 3);
idr_remove(&idr, 0);
+ assert(idr_alloc(&idr, DUMMY_PTR, 0, 0, GFP_KERNEL) == 0);
+ idr_remove(&idr, 1);
+ for (i = 1; i < RADIX_TREE_MAP_SIZE; i++)
+ assert(idr_alloc(&idr, DUMMY_PTR, 0, 0, GFP_KERNEL) == i);
+ idr_remove(&idr, 1 << 30);
+ idr_destroy(&idr);
+
for (i = INT_MAX - 3UL; i < INT_MAX + 1UL; i++) {
struct item *item = item_create(i, 0);
assert(idr_alloc(&idr, item, i, i + 10, GFP_KERNEL) == i);
_
Patches currently in -mm which might be from mawilcox(a)microsoft.com are
slab-__gfp_zero-is-incompatible-with-a-constructor.patch
s390-use-_refcount-for-pgtables.patch
mm-split-page_type-out-from-_mapcount.patch
mm-mark-pages-in-use-for-page-tables.patch
mm-switch-s_mem-and-slab_cache-in-struct-page.patch
mm-move-private-union-within-struct-page.patch
mm-move-_refcount-out-of-struct-page-union.patch
mm-combine-first-three-unions-in-struct-page.patch
mm-use-page-deferred_list.patch
mm-move-lru-union-within-struct-page.patch
mm-combine-lru-and-main-union-in-struct-page.patch
mm-improve-struct-page-documentation.patch
mm-add-pt_mm-to-struct-page.patch
mm-add-hmm_data-to-struct-page.patch
slabslub-remove-rcu_head-size-checks.patch
slub-remove-kmem_cache-reserved.patch
slub-remove-reserved-file-from-sysfs.patch
ida-remove-simple_ida_lock.patch