The patch titled
Subject: mm/hmm: hmm_pfns_bad() was accessing wrong struct
has been added to the -mm tree. Its filename is
mm-hmm-hmm_pfns_bad-was-accessing-wrong-struct.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-hmm-hmm_pfns_bad-was-accessing-…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-hmm-hmm_pfns_bad-was-accessing-…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/SubmitChecklist when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Jérôme Glisse <jglisse(a)redhat.com>
Subject: mm/hmm: hmm_pfns_bad() was accessing wrong struct
The private field of struct mm_walk points to an hmm_vma_walk struct, not
to the hmm_range struct that hmm_pfns_bad() expects. Fix it to fetch the
proper struct pointer.
Link: http://lkml.kernel.org/r/20180320020038.3360-6-jglisse@redhat.com
Signed-off-by: Jérôme Glisse <jglisse(a)redhat.com>
Cc: Evgeny Baskakov <ebaskakov(a)nvidia.com>
Cc: Ralph Campbell <rcampbell(a)nvidia.com>
Cc: Mark Hairgrove <mhairgrove(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Balbir Singh <bsingharora(a)gmail.com>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Stephen Bates <sbates(a)raithlin.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
diff -puN mm/hmm.c~mm-hmm-hmm_pfns_bad-was-accessing-wrong-struct mm/hmm.c
--- a/mm/hmm.c~mm-hmm-hmm_pfns_bad-was-accessing-wrong-struct
+++ a/mm/hmm.c
@@ -312,7 +312,8 @@ static int hmm_pfns_bad(unsigned long ad
unsigned long end,
struct mm_walk *walk)
{
- struct hmm_range *range = walk->private;
+ struct hmm_vma_walk *hmm_vma_walk = walk->private;
+ struct hmm_range *range = hmm_vma_walk->range;
hmm_pfn_t *pfns = range->pfns;
unsigned long i;
_
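The bug class fixed above is easy to reproduce outside the kernel: an opaque `void *` cookie gets read as the wrong structure type. The following userspace sketch mirrors the shape of the fix. The struct layouts, the `_SKETCH` constants and the `pfns_bad()` helper are simplified stand-ins chosen for illustration, not the kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel structures (layouts are assumptions). */
struct hmm_range {
	uint64_t *pfns;
	unsigned long start;
	unsigned long end;
};

struct hmm_vma_walk {
	struct hmm_range *range;
	unsigned long last;
};

struct mm_walk {
	void *private;	/* opaque cookie owned by whoever started the walk */
};

#define PAGE_SHIFT_SKETCH 12		/* assumed 4K pages */
#define PFN_ERROR_SKETCH  (~0ULL)	/* hypothetical "bad entry" marker */

/* Mark every page in [addr, end) as bad; returns the number of entries set. */
static unsigned long pfns_bad(unsigned long addr, unsigned long end,
			      struct mm_walk *walk)
{
	/* The cookie is the walk state, which in turn points at the range. */
	struct hmm_vma_walk *hmm_vma_walk = walk->private;
	struct hmm_range *range = hmm_vma_walk->range;
	unsigned long i = (addr - range->start) >> PAGE_SHIFT_SKETCH;
	unsigned long n = 0;

	for (; addr < end; addr += 1UL << PAGE_SHIFT_SKETCH, i++, n++)
		range->pfns[i] = PFN_ERROR_SKETCH;
	return n;
}
```

Casting `walk->private` straight to `struct hmm_range *` would compile without a warning, which is presumably why the original mistake went unnoticed until the range fields were read.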
Patches currently in -mm which might be from jglisse(a)redhat.com are
mm-hmm-fix-header-file-if-else-endif-maze-v2.patch
mm-hmm-unregister-mmu_notifier-when-last-hmm-client-quit.patch
mm-hmm-hmm_pfns_bad-was-accessing-wrong-struct.patch
mm-hmm-use-struct-for-hmm_vma_fault-hmm_vma_get_pfns-parameters-v2.patch
mm-hmm-remove-hmm_pfn_read-flag-and-ignore-peculiar-architecture-v2.patch
mm-hmm-use-uint64_t-for-hmm-pfn-instead-of-defining-hmm_pfn_t-to-ulong-v2.patch
mm-hmm-cleanup-special-vma-handling-vm_special.patch
mm-hmm-do-not-differentiate-between-empty-entry-or-missing-directory-v2.patch
mm-hmm-rename-hmm_pfn_device_unaddressable-to-hmm_pfn_device_private.patch
mm-hmm-move-hmm_pfns_clear-closer-to-where-it-is-use.patch
mm-hmm-factor-out-pte-and-pmd-handling-to-simplify-hmm_vma_walk_pmd.patch
mm-hmm-change-hmm_vma_fault-to-allow-write-fault-on-page-basis.patch
mm-hmm-use-device-driver-encoding-for-hmm-pfn-v2.patch
The patch titled
Subject: mm/hmm: HMM should have a callback before MM is destroyed v2
has been added to the -mm tree. Its filename is
mm-hmm-hmm-should-have-a-callback-before-mm-is-destroyed-v2.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-hmm-hmm-should-have-a-callback-…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-hmm-hmm-should-have-a-callback-…
------------------------------------------------------
From: Ralph Campbell <rcampbell(a)nvidia.com>
Subject: mm/hmm: HMM should have a callback before MM is destroyed v2
The hmm_mirror_register() function registers a callback for when the CPU
pagetable is modified. Normally, the device driver will call
hmm_mirror_unregister() when the process using the device is finished.
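However, if the process exits uncleanly, the mm_struct can be destroyed
with no warning to the device driver. Add a release() callback to
hmm_mirror_ops so the driver is told when the mm_struct is being released.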
Link: http://lkml.kernel.org/r/20180320020038.3360-4-jglisse@redhat.com
Signed-off-by: Ralph Campbell <rcampbell(a)nvidia.com>
Signed-off-by: Jérôme Glisse <jglisse(a)redhat.com>
Cc: Evgeny Baskakov <ebaskakov(a)nvidia.com>
Cc: Mark Hairgrove <mhairgrove(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Balbir Singh <bsingharora(a)gmail.com>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Stephen Bates <sbates(a)raithlin.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
diff -puN include/linux/hmm.h~mm-hmm-hmm-should-have-a-callback-before-mm-is-destroyed-v2 include/linux/hmm.h
--- a/include/linux/hmm.h~mm-hmm-hmm-should-have-a-callback-before-mm-is-destroyed-v2
+++ a/include/linux/hmm.h
@@ -218,6 +218,16 @@ enum hmm_update_type {
* @update: callback to update range on a device
*/
struct hmm_mirror_ops {
+ /* release() - release hmm_mirror
+ *
+ * @mirror: pointer to struct hmm_mirror
+ *
+ * This is called when the mm_struct is being released.
+ * The callback should make sure no references to the mirror occur
+ * after the callback returns.
+ */
+ void (*release)(struct hmm_mirror *mirror);
+
/* sync_cpu_device_pagetables() - synchronize page tables
*
* @mirror: pointer to struct hmm_mirror
diff -puN mm/hmm.c~mm-hmm-hmm-should-have-a-callback-before-mm-is-destroyed-v2 mm/hmm.c
--- a/mm/hmm.c~mm-hmm-hmm-should-have-a-callback-before-mm-is-destroyed-v2
+++ a/mm/hmm.c
@@ -160,6 +160,21 @@ static void hmm_invalidate_range(struct
up_read(&hmm->mirrors_sem);
}
+static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
+{
+ struct hmm *hmm = mm->hmm;
+ struct hmm_mirror *mirror;
+ struct hmm_mirror *mirror_next;
+
+ down_write(&hmm->mirrors_sem);
+ list_for_each_entry_safe(mirror, mirror_next, &hmm->mirrors, list) {
+ list_del_init(&mirror->list);
+ if (mirror->ops->release)
+ mirror->ops->release(mirror);
+ }
+ up_write(&hmm->mirrors_sem);
+}
+
static void hmm_invalidate_range_start(struct mmu_notifier *mn,
struct mm_struct *mm,
unsigned long start,
@@ -185,6 +200,7 @@ static void hmm_invalidate_range_end(str
}
static const struct mmu_notifier_ops hmm_mmu_notifier_ops = {
+ .release = hmm_release,
.invalidate_range_start = hmm_invalidate_range_start,
.invalidate_range_end = hmm_invalidate_range_end,
};
@@ -230,7 +246,7 @@ void hmm_mirror_unregister(struct hmm_mi
struct hmm *hmm = mirror->hmm;
down_write(&hmm->mirrors_sem);
- list_del(&mirror->list);
+ list_del_init(&mirror->list);
up_write(&hmm->mirrors_sem);
}
EXPORT_SYMBOL(hmm_mirror_unregister);
_
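The release path added above can be sketched in userspace as follows. This is a minimal model under stated assumptions: a singly linked list replaces the kernel `list_head`, the `mirrors_sem` locking is omitted, and all names (`mirror_list`, `release_all()`, `mark_released()`) are hypothetical.

```c
#include <stddef.h>

struct mirror;

struct mirror_ops {
	/* Optional: called once when the address space goes away. */
	void (*release)(struct mirror *mirror);
};

struct mirror {
	const struct mirror_ops *ops;
	struct mirror *next;
	int released;
};

struct mirror_list {
	struct mirror *head;
};

/*
 * Detach every mirror and notify its owner. The next pointer is saved
 * before the callback runs, because the callback may free or reuse the
 * node. Returns the number of callbacks invoked.
 */
static int release_all(struct mirror_list *list)
{
	struct mirror *mirror = list->head;
	int notified = 0;

	list->head = NULL;
	while (mirror) {
		struct mirror *next = mirror->next;

		mirror->next = NULL;	/* plays the role of list_del_init() */
		if (mirror->ops->release) {
			mirror->ops->release(mirror);
			notified++;
		}
		mirror = next;
	}
	return notified;
}

/* Example callback: a driver would stop using the mirror here. */
static void mark_released(struct mirror *mirror)
{
	mirror->released = 1;
}
```

Resetting the link on removal is what makes a later unregister call harmless in this model, which is likely the reason the patch also switches hmm_mirror_unregister() to list_del_init().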
Patches currently in -mm which might be from rcampbell(a)nvidia.com are
mm-hmm-documentation-editorial-update-to-hmm-documentation.patch
mm-hmm-hmm-should-have-a-callback-before-mm-is-destroyed-v2.patch
The patch titled
Subject: mm/hmm: fix header file if/else/endif maze v2
has been added to the -mm tree. Its filename is
mm-hmm-fix-header-file-if-else-endif-maze-v2.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-hmm-fix-header-file-if-else-end…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-hmm-fix-header-file-if-else-end…
------------------------------------------------------
From: Jérôme Glisse <jglisse(a)redhat.com>
Subject: mm/hmm: fix header file if/else/endif maze v2
The #if/#else/#endif for IS_ENABLED(CONFIG_HMM) were wrong. Because of
this, after multiple inclusions there were multiple definitions of both
hmm_mm_init() and hmm_mm_destroy(), leading to a build failure if HMM was
enabled (CONFIG_HMM set).
Link: http://lkml.kernel.org/r/20180320020038.3360-3-jglisse@redhat.com
Signed-off-by: Jérôme Glisse <jglisse(a)redhat.com>
Acked-by: Balbir Singh <bsingharora(a)gmail.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Ralph Campbell <rcampbell(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Evgeny Baskakov <ebaskakov(a)nvidia.com>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Mark Hairgrove <mhairgrove(a)nvidia.com>
Cc: Stephen Bates <sbates(a)raithlin.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
diff -puN include/linux/hmm.h~mm-hmm-fix-header-file-if-else-endif-maze-v2 include/linux/hmm.h
--- a/include/linux/hmm.h~mm-hmm-fix-header-file-if-else-endif-maze-v2
+++ a/include/linux/hmm.h
@@ -498,23 +498,16 @@ struct hmm_device {
struct hmm_device *hmm_device_new(void *drvdata);
void hmm_device_put(struct hmm_device *hmm_device);
#endif /* CONFIG_DEVICE_PRIVATE || CONFIG_DEVICE_PUBLIC */
-#endif /* IS_ENABLED(CONFIG_HMM) */
/* Below are for HMM internal use only! Not to be used by device driver! */
-#if IS_ENABLED(CONFIG_HMM_MIRROR)
void hmm_mm_destroy(struct mm_struct *mm);
static inline void hmm_mm_init(struct mm_struct *mm)
{
mm->hmm = NULL;
}
-#else /* IS_ENABLED(CONFIG_HMM_MIRROR) */
-static inline void hmm_mm_destroy(struct mm_struct *mm) {}
-static inline void hmm_mm_init(struct mm_struct *mm) {}
-#endif /* IS_ENABLED(CONFIG_HMM_MIRROR) */
-
-
#else /* IS_ENABLED(CONFIG_HMM) */
static inline void hmm_mm_destroy(struct mm_struct *mm) {}
static inline void hmm_mm_init(struct mm_struct *mm) {}
+#endif /* IS_ENABLED(CONFIG_HMM) */
#endif /* LINUX_HMM_H */
_
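The layout the patch ends up with, one conditional that provides either the real functions or empty stubs, can be sketched in isolation. Everything below is illustrative: `SKETCH_CONFIG_HMM` stands in for IS_ENABLED(CONFIG_HMM), and the `_sketch` names and the `destroyed` field are hypothetical.

```c
#include <stddef.h>

/* Set to 1 to model CONFIG_HMM=y; 0 models the option being off. */
#ifndef SKETCH_CONFIG_HMM
#define SKETCH_CONFIG_HMM 0
#endif

struct mm_struct_sketch {
	void *hmm;
	int destroyed;
};

#if SKETCH_CONFIG_HMM
/* Real implementations: exactly one definition per configuration. */
static inline void hmm_mm_init_sketch(struct mm_struct_sketch *mm)
{
	mm->hmm = NULL;
}

static inline void hmm_mm_destroy_sketch(struct mm_struct_sketch *mm)
{
	mm->destroyed = 1;
}
#else /* SKETCH_CONFIG_HMM */
/* Empty stubs so that callers need no #ifdef of their own. */
static inline void hmm_mm_init_sketch(struct mm_struct_sketch *mm)
{
	(void)mm;
}

static inline void hmm_mm_destroy_sketch(struct mm_struct_sketch *mm)
{
	(void)mm;
}
#endif /* SKETCH_CONFIG_HMM */
```

With a single #if/#else/#endif pair, each function name is defined exactly once whichever way the option is set, so a duplicate definition cannot arise from the conditionals themselves.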
The patch titled
Subject: mm/hugetlb: prevent hugetlb VMA to be misaligned
has been added to the -mm tree. Its filename is
mm-hugetlb-prevent-hugetlb-vma-to-be-misaligned.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-prevent-hugetlb-vma-to-…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-prevent-hugetlb-vma-to-…
------------------------------------------------------
From: Laurent Dufour <ldufour(a)linux.vnet.ibm.com>
Subject: mm/hugetlb: prevent hugetlb VMA to be misaligned
When running the sampler detailed below, the kernel, if built with the VM
debug option turned on (as many distros do), panics with the following
message:
kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/mm/hugetlb.c:3310!
Oops: Exception in kernel mode, sig: 5 [#1]
LE SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: kcm nfc af_alg caif_socket caif phonet fcrypt
8<--8<--8<--8< snip 8<--8<--8<--8<
CPU: 18 PID: 43243 Comm: trinity-subchil Tainted: G C E
4.15.0-10-generic #11-Ubuntu
NIP: c00000000036e764 LR: c00000000036ee48 CTR: 0000000000000009
REGS: c000003fbcdcf810 TRAP: 0700 Tainted: G C E
(4.15.0-10-generic)
MSR: 9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24002222 XER:
20040000
CFAR: c00000000036ee44 SOFTE: 1
GPR00: c00000000036ee48 c000003fbcdcfa90 c0000000016ea600 c000003fbcdcfc40
GPR04: c000003fd9858950 00007115e4e00000 00007115e4e10000 0000000000000000
GPR08: 0000000000000010 0000000000010000 0000000000000000 0000000000000000
GPR12: 0000000000002000 c000000007a2c600 00000fe3985954d0 00007115e4e00000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 00000fe398595a94 000000000000a6fc c000003fd9858950 0000000000018554
GPR24: c000003fdcd84500 c0000000019acd00 00007115e4e10000 c000003fbcdcfc40
GPR28: 0000000000200000 00007115e4e00000 c000003fbc9ac600 c000003fd9858950
NIP [c00000000036e764] __unmap_hugepage_range+0xa4/0x760
LR [c00000000036ee48] __unmap_hugepage_range_final+0x28/0x50
Call Trace:
[c000003fbcdcfa90] [00007115e4e00000] 0x7115e4e00000 (unreliable)
[c000003fbcdcfb50] [c00000000036ee48]
__unmap_hugepage_range_final+0x28/0x50
[c000003fbcdcfb80] [c00000000033497c] unmap_single_vma+0x11c/0x190
[c000003fbcdcfbd0] [c000000000334e14] unmap_vmas+0x94/0x140
[c000003fbcdcfc20] [c00000000034265c] exit_mmap+0x9c/0x1d0
[c000003fbcdcfce0] [c000000000105448] mmput+0xa8/0x1d0
[c000003fbcdcfd10] [c00000000010fad0] do_exit+0x360/0xc80
[c000003fbcdcfdd0] [c0000000001104c0] do_group_exit+0x60/0x100
[c000003fbcdcfe10] [c000000000110584] SyS_exit_group+0x24/0x30
[c000003fbcdcfe30] [c00000000000b184] system_call+0x58/0x6c
Instruction dump:
552907fe e94a0028 e94a0408 eb2a0018 81590008 7f9c5036 0b090000 e9390010
7d2948f8 7d2a2838 0b0a0000 7d293038 <0b090000> e9230086 2fa90000 419e0468
===[ end trace ee88f958a1c62605 ]===
The panic is due to a VMA pointing to a hugetlb area while the
vma->vm_start or vma->vm_end field is not aligned to the huge page
boundaries. The sampler just unmaps a part of the hugetlb area,
leading to 2 VMAs which are not well aligned. The same situation can be
reached by calling madvise(), as happens when running: stress-ng
--shm-sysv 1
The hugetlb code assumes that the VMA will be well aligned when it is
unmapped, so we must prevent such a VMA from being split or shrunk to a
misaligned address.
This patch prevents this by checking the new VMA's boundaries when a VMA
is modified by calling vma_adjust().
=== Sampler used to hit the panic
#include <stdio.h>
#include <stdlib.h>
#include <sys/ipc.h>
#include <sys/mman.h>
#include <sys/shm.h>
#include <unistd.h>

/*
 * The definitions of HPSIZE and LENGTH were lost from this copy of the
 * mail. The values below are assumptions: 2MB huge pages and a segment
 * spanning two of them.
 */
#define HPSIZE	(2UL * 1024 * 1024)
#define LENGTH	(2 * HPSIZE)

unsigned long page_size;

int main(void)
{
	int shmid, ret = 1;
	void *addr;

	setbuf(stdout, NULL);
	page_size = getpagesize();

	shmid = shmget(0x1410, LENGTH, IPC_CREAT | SHM_HUGETLB | SHM_R |
		       SHM_W);
	if (shmid < 0) {
		perror("shmget");
		exit(1);
	}
	printf("shmid: %d\n", shmid);

	addr = shmat(shmid, NULL, 0);
	if (addr == (void *)-1) {
		perror("shmat");
		goto out;
	}

	/*
	 * The following munmap() call will split the VMA in 2, leading to
	 * VMAs that are not aligned to the huge page size, which will
	 * trigger a check when shmdt() is called.
	 */
	if (munmap(addr + HPSIZE + page_size, page_size)) {
		perror("munmap");
		goto out;
	}

	if (shmdt(addr)) {
		perror("shmdt");
		goto out;
	}

	printf("test done.\n");
	ret = 0;

out:
	shmctl(shmid, IPC_RMID, NULL);
	return ret;
}
=== End of code
Link: http://lkml.kernel.org/r/1521566754-30390-1-git-send-email-ldufour@linux.vn…
Signed-off-by: Laurent Dufour <ldufour(a)linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
diff -puN mm/mmap.c~mm-hugetlb-prevent-hugetlb-vma-to-be-misaligned mm/mmap.c
--- a/mm/mmap.c~mm-hugetlb-prevent-hugetlb-vma-to-be-misaligned
+++ a/mm/mmap.c
@@ -692,6 +692,17 @@ int __vma_adjust(struct vm_area_struct *
long adjust_next = 0;
int remove_next = 0;
+ if (is_vm_hugetlb_page(vma)) {
+ /*
+ * We must check against the huge page boundaries to avoid
+ * creating a misaligned VMA.
+ */
+ struct hstate *h = hstate_vma(vma);
+
+ if (start & ~huge_page_mask(h) || end & ~huge_page_mask(h))
+ return -EINVAL;
+ }
+
if (next && !insert) {
struct vm_area_struct *exporter = NULL, *importer = NULL;
_
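The boundary test added to __vma_adjust() reduces to a mask check, which can be tried in userspace. This is a minimal sketch: `range_is_huge_aligned()` is a hypothetical helper, and it takes the huge page size directly, whereas the kernel gets a mask from `huge_page_mask(h)` and inverts it.

```c
#include <stdbool.h>

/*
 * Returns true when both ends of [start, end) sit on a huge page boundary.
 * huge_page_size must be a power of two, so that (size - 1) selects the
 * offset bits inside a huge page (the equivalent of ~huge_page_mask(h)).
 */
static bool range_is_huge_aligned(unsigned long start, unsigned long end,
				  unsigned long huge_page_size)
{
	unsigned long offset_mask = huge_page_size - 1;

	return !(start & offset_mask) && !(end & offset_mask);
}
```

With 2MB huge pages, for example, a range ending at 0x00e10000 is rejected because 0x10000 bytes of it spill past the boundary at 0x00e00000; that corresponds to the case where the patch returns -EINVAL.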
Patches currently in -mm which might be from ldufour(a)linux.vnet.ibm.com are
mm-hugetlb-prevent-hugetlb-vma-to-be-misaligned.patch
kernel/locking/mutex.c: In function '__mutex_unlock_common_slowpath':
build/kernel/locking/mutex.c:721:2: error: implicit declaration of function 'WAKE_Q'
kernel/locking/mutex.c:721:9: error: 'wake_q' undeclared
This affects all architectures and all builds.
Guenter
If a task is holding a reference to a namespace on a removed controller,
the head will not be released. If the same controller is added again
later, its namespaces may not be successfully added. Instead, the user
will see kernel message "Duplicate IDs for nsid <X>".
This patch fixes that by skipping heads that don't have namespaces when
considering if a new namespace is safe to add.
Reported-by: Alex Gagniuc <Alex_Gagniuc(a)Dellteam.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Keith Busch <keith.busch(a)intel.com>
---
drivers/nvme/host/core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 7aeca5db7916..0b9e60861e53 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2793,6 +2793,7 @@ static int __nvme_check_ids(struct nvme_subsystem *subsys,
list_for_each_entry(h, &subsys->nsheads, entry) {
if (nvme_ns_ids_valid(&new->ids) &&
+ !list_empty(&h->list) &&
nvme_ns_ids_equal(&new->ids, &h->ids))
return -EINVAL;
}
--
2.14.3
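The effect of the added !list_empty() test can be modelled outside the driver. The sketch below is an assumption-laden simplification: a single integer stands in for the NVMe namespace identifiers, a counter replaces the per-head namespace list, and the names `ns_head_sketch` and `id_is_duplicate()` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

struct ns_head_sketch {
	unsigned int id;		/* stands in for the namespace IDs */
	unsigned int nr_namespaces;	/* 0: only a stale reference keeps it alive */
	struct ns_head_sketch *next;
};

/* Returns true if new_id collides with a head that still has namespaces. */
static bool id_is_duplicate(const struct ns_head_sketch *heads,
			    unsigned int new_id)
{
	const struct ns_head_sketch *h;

	for (h = heads; h; h = h->next) {
		if (h->nr_namespaces == 0)
			continue;	/* orphaned head: do not count it */
		if (h->id == new_id)
			return true;
	}
	return false;
}
```

A head left behind by a removed controller therefore no longer blocks a namespace with the same identifier from being added again, while a genuine duplicate among live heads is still refused.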
Hi,
Unfortunately, I missed the message that the stable release queue repository
for 4.1.y changed. It looks like I am not copied on release announcements,
so I missed the last couple of releases. As a result, my builders did not
catch any changes for some time.
I just ran a test on v4.1.50 and got the following results.
Build results:
total: 139 pass: 133 fail: 6
Failed builds:
arm:axm55xx_defconfig
arm64:allmodconfig
i386:defconfig
i386:allyesconfig
i386:allmodconfig
i386:allnoconfig
Qemu test results:
total: 125 pass: 111 fail: 14
Failed tests:
arm64:virt:smp:defconfig:initrd
arm64:xlnx-zcu102:smp:defconfig:initrd:zynqmp-ep108
arm64:virt:nosmp:defconfig:initrd
arm64:xlnx-zcu102:nosmp:defconfig:initrd:zynqmp-ep108
x86:Broadwell:q35:x86_pc_defconfig
x86:Skylake-Client:q35:x86_pc_defconfig
x86:SandyBridge:q35:x86_pc_defconfig
x86:Haswell:pc:x86_pc_defconfig
x86:Nehalem:q35:x86_pc_defconfig
x86:phenom:pc:x86_pc_defconfig
x86:core2duo:q35:x86_pc_nosmp_defconfig
x86:Conroe:isapc:x86_pc_nosmp_defconfig
x86:Opteron_G1:pc:x86_pc_nosmp_defconfig
x86:n270:isapc:x86_pc_nosmp_defconfig
From the logs:
Building arm:axm55xx_defconfig ... failed
--------------
Error log:
arch/arm/kvm/handle_exit.c: In function 'handle_hvc':
arch/arm/kvm/handle_exit.c:48:3: error: implicit declaration of function 'vcpu_set_reg'
Building arm64:allmodconfig ... failed
--------------
Error log:
arch/arm64/kvm/handle_exit.c:45:3: error: implicit declaration of function ‘vcpu_set_reg’; did you mean ‘vcpu_sys_reg’?
Building i386:defconfig ... failed
--------------
Error log:
arch/x86/lib/checksum_32.S:32:31: fatal error: asm/nospec-branch.h
All other failures have the same cause.
Guenter
Hi Greg,
Hi Sasha,
commit 9b54d816e0042 ("blkcg: fix double free of new_blkg in blkcg_init_queue")
fixes CVE-2018-7480. See https://nvd.nist.gov/vuln/detail/CVE-2018-7480 for details.
Please apply to v4.1.y .. v4.9.y.
Thanks,
Guenter
Commit 726d061fbd36 ("mm: vmscan: kick flushers when we encounter
dirty pages on the LRU") added flusher invocation to
shrink_inactive_list() when many dirty pages on the LRU are encountered.
However, shrink_inactive_list() doesn't wake up flushers for legacy
cgroup reclaim, so the next commit bbef938429f5 ("mm: vmscan: remove
old flusher wakeup from direct reclaim path") removed the only source
of flusher wakeups in the legacy memcg reclaim path.
This leads to premature OOM if there are too many dirty pages in a cgroup:
# mkdir /sys/fs/cgroup/memory/test
# echo $$ > /sys/fs/cgroup/memory/test/tasks
# echo 50M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
# dd if=/dev/zero of=tmp_file bs=1M count=100
Killed
dd invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=0
Call Trace:
dump_stack+0x46/0x65
dump_header+0x6b/0x2ac
oom_kill_process+0x21c/0x4a0
out_of_memory+0x2a5/0x4b0
mem_cgroup_out_of_memory+0x3b/0x60
mem_cgroup_oom_synchronize+0x2ed/0x330
pagefault_out_of_memory+0x24/0x54
__do_page_fault+0x521/0x540
page_fault+0x45/0x50
Task in /test killed as a result of limit of /test
memory: usage 51200kB, limit 51200kB, failcnt 73
memory+swap: usage 51200kB, limit 9007199254740988kB, failcnt 0
kmem: usage 296kB, limit 9007199254740988kB, failcnt 0
Memory cgroup stats for /test: cache:49632KB rss:1056KB rss_huge:0KB shmem:0KB
mapped_file:0KB dirty:49500KB writeback:0KB swap:0KB inactive_anon:0KB
active_anon:1168KB inactive_file:24760KB active_file:24960KB unevictable:0KB
Memory cgroup out of memory: Kill process 3861 (bash) score 88 or sacrifice child
Killed process 3876 (dd) total-vm:8484kB, anon-rss:1052kB, file-rss:1720kB, shmem-rss:0kB
oom_reaper: reaped process 3876 (dd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Wake up flushers in legacy cgroup reclaim too.
Fixes: bbef938429f5 ("mm: vmscan: remove old flusher wakeup from direct reclaim path")
Signed-off-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: <stable(a)vger.kernel.org>
---
mm/vmscan.c | 31 ++++++++++++++++---------------
1 file changed, 16 insertions(+), 15 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 8fcd9f8d7390..4390a8d5be41 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1771,6 +1771,20 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
if (stat.nr_writeback && stat.nr_writeback == nr_taken)
set_bit(PGDAT_WRITEBACK, &pgdat->flags);
+ /*
+ * If dirty pages are scanned that are not queued for IO, it
+ * implies that flushers are not doing their job. This can
+ * happen when memory pressure pushes dirty pages to the end of
+ * the LRU before the dirty limits are breached and the dirty
+ * data has expired. It can also happen when the proportion of
+ * dirty pages grows not through writes but through memory
+ * pressure reclaiming all the clean cache. And in some cases,
+ * the flushers simply cannot keep up with the allocation
+ * rate. Nudge the flusher threads in case they are asleep.
+ */
+ if (stat.nr_unqueued_dirty == nr_taken)
+ wakeup_flusher_threads(WB_REASON_VMSCAN);
+
/*
* Legacy memcg will stall in page writeback so avoid forcibly
* stalling here.
@@ -1783,22 +1797,9 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
if (stat.nr_dirty && stat.nr_dirty == stat.nr_congested)
set_bit(PGDAT_CONGESTED, &pgdat->flags);
- /*
- * If dirty pages are scanned that are not queued for IO, it
- * implies that flushers are not doing their job. This can
- * happen when memory pressure pushes dirty pages to the end of
- * the LRU before the dirty limits are breached and the dirty
- * data has expired. It can also happen when the proportion of
- * dirty pages grows not through writes but through memory
- * pressure reclaiming all the clean cache. And in some cases,
- * the flushers simply cannot keep up with the allocation
- * rate. Nudge the flusher threads in case they are asleep, but
- * also allow kswapd to start writing pages during reclaim.
- */
- if (stat.nr_unqueued_dirty == nr_taken) {
- wakeup_flusher_threads(WB_REASON_VMSCAN);
+ /* Allow kswapd to start writing pages during reclaim. */
+ if (stat.nr_unqueued_dirty == nr_taken)
set_bit(PGDAT_DIRTY, &pgdat->flags);
- }
/*
* If kswapd scans pages marked marked for immediate
--
2.16.1
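The reordering in this patch can be summarised with a small decision model. It is a sketch, not the kernel code: the `legacy_memcg` flag is an assumption standing in for whatever condition guards the block under the "Legacy memcg will stall in page writeback" comment (the guard itself is not visible in the hunks), and `scan_outcome` and `after_scan()` are hypothetical names.

```c
#include <stdbool.h>

struct scan_outcome {
	bool woke_flushers;	/* models wakeup_flusher_threads() */
	bool pgdat_dirty;	/* models set_bit(PGDAT_DIRTY, ...) */
};

/*
 * Decide what happens after one pass over the inactive list.
 * legacy_memcg is an assumed flag meaning "legacy cgroup reclaim".
 */
static struct scan_outcome after_scan(unsigned long nr_unqueued_dirty,
				      unsigned long nr_taken,
				      bool legacy_memcg)
{
	struct scan_outcome out = { false, false };
	bool all_unqueued_dirty = (nr_unqueued_dirty == nr_taken);

	/* After the patch: runs for every reclaim type. */
	if (all_unqueued_dirty)
		out.woke_flushers = true;

	/* Node-wide hints stay limited to non-legacy reclaim. */
	if (!legacy_memcg && all_unqueued_dirty)
		out.pgdat_dirty = true;

	return out;
}
```

Before the patch, both actions sat behind the same guard in this model, so a legacy cgroup full of dirty pages never nudged the flushers, which matches the premature OOM shown in the log above.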