March 2018 - Linux-stable-mirror

+ mm-hmm-hmm_pfns_bad-was-accessing-wrong-struct.patch added to -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: mm/hmm: hmm_pfns_bad() was accessing wrong struct
has been added to the -mm tree.  Its filename is
     mm-hmm-hmm_pfns_bad-was-accessing-wrong-struct.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-hmm-hmm_pfns_bad-was-accessing-…
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-hmm-hmm_pfns_bad-was-accessing-…

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Jérôme Glisse <jglisse(a)redhat.com>
Subject: mm/hmm: hmm_pfns_bad() was accessing wrong struct

The private field of the mm_walk struct points to an hmm_vma_walk struct,
not to the desired hmm_range struct.  Fix it to fetch the proper struct
pointer.

Link: http://lkml.kernel.org/r/20180320020038.3360-6-jglisse@redhat.com
Signed-off-by: Jérôme Glisse <jglisse(a)redhat.com>
Cc: Evgeny Baskakov <ebaskakov(a)nvidia.com>
Cc: Ralph Campbell <rcampbell(a)nvidia.com>
Cc: Mark Hairgrove <mhairgrove(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Balbir Singh <bsingharora(a)gmail.com>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Stephen Bates <sbates(a)raithlin.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

diff -puN mm/hmm.c~mm-hmm-hmm_pfns_bad-was-accessing-wrong-struct mm/hmm.c
--- a/mm/hmm.c~mm-hmm-hmm_pfns_bad-was-accessing-wrong-struct
+++ a/mm/hmm.c
@@ -312,7 +312,8 @@ static int hmm_pfns_bad(unsigned long ad
 			unsigned long end,
 			struct mm_walk *walk)
 {
-	struct hmm_range *range = walk->private;
+	struct hmm_vma_walk *hmm_vma_walk = walk->private;
+	struct hmm_range *range = hmm_vma_walk->range;
 	hmm_pfn_t *pfns = range->pfns;
 	unsigned long i;
 
_

Patches currently in -mm which might be from jglisse(a)redhat.com are

mm-hmm-fix-header-file-if-else-endif-maze-v2.patch
mm-hmm-unregister-mmu_notifier-when-last-hmm-client-quit.patch
mm-hmm-hmm_pfns_bad-was-accessing-wrong-struct.patch
mm-hmm-use-struct-for-hmm_vma_fault-hmm_vma_get_pfns-parameters-v2.patch
mm-hmm-remove-hmm_pfn_read-flag-and-ignore-peculiar-architecture-v2.patch
mm-hmm-use-uint64_t-for-hmm-pfn-instead-of-defining-hmm_pfn_t-to-ulong-v2.patch
mm-hmm-cleanup-special-vma-handling-vm_special.patch
mm-hmm-do-not-differentiate-between-empty-entry-or-missing-directory-v2.patch
mm-hmm-rename-hmm_pfn_device_unaddressable-to-hmm_pfn_device_private.patch
mm-hmm-move-hmm_pfns_clear-closer-to-where-it-is-use.patch
mm-hmm-factor-out-pte-and-pmd-handling-to-simplify-hmm_vma_walk_pmd.patch
mm-hmm-change-hmm_vma_fault-to-allow-write-fault-on-page-basis.patch
mm-hmm-use-device-driver-encoding-for-hmm-pfn-v2.patch
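The bug class fixed above (an opaque "private" pointer read back as the wrong type) can be illustrated outside the kernel. The following is only a user-space sketch: every name ending in _sketch is a hypothetical stand-in for the kernel structures, and the page-index arithmetic is an assumption made for the illustration, not something quoted from mm/hmm.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for hmm_range, hmm_vma_walk and mm_walk. */
struct range_sketch {
    uint64_t *pfns;       /* output array, one entry per page */
    unsigned long start;  /* address covered by pfns[0] */
};

struct vma_walk_sketch {
    struct range_sketch *range;  /* the struct the callback really wants */
    unsigned long last;
};

struct walk_sketch {
    void *private;  /* opaque pointer handed to every callback */
};

#define PAGE_SHIFT_SKETCH 12
#define PFN_ERROR_SKETCH  UINT64_MAX

/*
 * Mark every page in [addr, end) as bad.  The private pointer refers to
 * the wrapper struct, so the range is reached through it rather than by
 * treating private as the range itself.
 */
static int pfns_bad_sketch(unsigned long addr, unsigned long end,
                           struct walk_sketch *walk)
{
    struct vma_walk_sketch *vma_walk = walk->private;
    struct range_sketch *range = vma_walk->range;
    unsigned long i = (addr - range->start) >> PAGE_SHIFT_SKETCH;

    for (; addr < end; addr += 1UL << PAGE_SHIFT_SKETCH, i++)
        range->pfns[i] = PFN_ERROR_SKETCH;
    return 0;
}
```

Because private is a void pointer, the compiler accepts either assignment without a warning, which is presumably why the original mistake went unnoticed at build time.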

+ mm-hmm-hmm-should-have-a-callback-before-mm-is-destroyed-v2.patch added to -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: mm/hmm: HMM should have a callback before MM is destroyed v2
has been added to the -mm tree.  Its filename is
     mm-hmm-hmm-should-have-a-callback-before-mm-is-destroyed-v2.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-hmm-hmm-should-have-a-callback-…
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-hmm-hmm-should-have-a-callback-…

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Ralph Campbell <rcampbell(a)nvidia.com>
Subject: mm/hmm: HMM should have a callback before MM is destroyed v2

The hmm_mirror_register() function registers a callback for when the CPU
pagetable is modified.  Normally, the device driver will call
hmm_mirror_unregister() when the process using the device is finished.
However, if the process exits uncleanly, the mm_struct can be destroyed
with no warning to the device driver.

Link: http://lkml.kernel.org/r/20180320020038.3360-4-jglisse@redhat.com
Signed-off-by: Ralph Campbell <rcampbell(a)nvidia.com>
Signed-off-by: Jérôme Glisse <jglisse(a)redhat.com>
Cc: Evgeny Baskakov <ebaskakov(a)nvidia.com>
Cc: Mark Hairgrove <mhairgrove(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Balbir Singh <bsingharora(a)gmail.com>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Stephen Bates <sbates(a)raithlin.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

diff -puN include/linux/hmm.h~mm-hmm-hmm-should-have-a-callback-before-mm-is-destroyed-v2 include/linux/hmm.h
--- a/include/linux/hmm.h~mm-hmm-hmm-should-have-a-callback-before-mm-is-destroyed-v2
+++ a/include/linux/hmm.h
@@ -218,6 +218,16 @@ enum hmm_update_type {
  * @update: callback to update range on a device
  */
 struct hmm_mirror_ops {
+	/* release() - release hmm_mirror
+	 *
+	 * @mirror: pointer to struct hmm_mirror
+	 *
+	 * This is called when the mm_struct is being released.
+	 * The callback should make sure no references to the mirror occur
+	 * after the callback returns.
+	 */
+	void (*release)(struct hmm_mirror *mirror);
+
 	/* sync_cpu_device_pagetables() - synchronize page tables
 	 *
 	 * @mirror: pointer to struct hmm_mirror
diff -puN mm/hmm.c~mm-hmm-hmm-should-have-a-callback-before-mm-is-destroyed-v2 mm/hmm.c
--- a/mm/hmm.c~mm-hmm-hmm-should-have-a-callback-before-mm-is-destroyed-v2
+++ a/mm/hmm.c
@@ -160,6 +160,21 @@ static void hmm_invalidate_range(struct
 	up_read(&hmm->mirrors_sem);
 }
 
+static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
+{
+	struct hmm *hmm = mm->hmm;
+	struct hmm_mirror *mirror;
+	struct hmm_mirror *mirror_next;
+
+	down_write(&hmm->mirrors_sem);
+	list_for_each_entry_safe(mirror, mirror_next, &hmm->mirrors, list) {
+		list_del_init(&mirror->list);
+		if (mirror->ops->release)
+			mirror->ops->release(mirror);
+	}
+	up_write(&hmm->mirrors_sem);
+}
+
 static void hmm_invalidate_range_start(struct mmu_notifier *mn,
 				       struct mm_struct *mm,
 				       unsigned long start,
@@ -185,6 +200,7 @@ static void hmm_invalidate_range_end(str
 }
 
 static const struct mmu_notifier_ops hmm_mmu_notifier_ops = {
+	.release		= hmm_release,
 	.invalidate_range_start	= hmm_invalidate_range_start,
 	.invalidate_range_end	= hmm_invalidate_range_end,
 };
@@ -230,7 +246,7 @@ void hmm_mirror_unregister(struct hmm_mi
 	struct hmm *hmm = mirror->hmm;
 
 	down_write(&hmm->mirrors_sem);
-	list_del(&mirror->list);
+	list_del_init(&mirror->list);
 	up_write(&hmm->mirrors_sem);
 }
 EXPORT_SYMBOL(hmm_mirror_unregister);
_

Patches currently in -mm which might be from rcampbell(a)nvidia.com are

mm-hmm-documentation-editorial-update-to-hmm-documentation.patch
mm-hmm-hmm-should-have-a-callback-before-mm-is-destroyed-v2.patch
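Two details of the patch above are easy to miss: the next entry is saved before the current one is unlinked, and the entry is re-initialised on removal so that a later unregister call can unlink it again without harm. A user-space sketch of that pattern follows. The tiny list implementation and all _sketch names are hypothetical, and the locking done with mirrors_sem in the patch is left out here.

```c
#include <stddef.h>

/* Minimal circular doubly linked list, standing in for list_head. */
struct node { struct node *prev, *next; };

static void node_init(struct node *n) { n->prev = n->next = n; }

static void node_add(struct node *n, struct node *head)
{
    n->next = head->next;
    n->prev = head;
    head->next->prev = n;
    head->next = n;
}

/* Unlink and re-initialise, so that a second unlink is harmless. */
static void node_del_init(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    node_init(n);
}

struct mirror_sketch;

struct mirror_ops_sketch {
    void (*release)(struct mirror_sketch *m);  /* optional, may be NULL */
};

struct mirror_sketch {
    struct node list;  /* kept first so a node pointer can be cast back */
    const struct mirror_ops_sketch *ops;
    int released;
};

/* Drop every mirror from the list and notify those that asked for it. */
static void release_all_sketch(struct node *mirrors)
{
    struct node *pos = mirrors->next, *next;

    for (; pos != mirrors; pos = next) {
        struct mirror_sketch *m = (struct mirror_sketch *)pos;

        next = pos->next;  /* saved before the entry is unlinked */
        node_del_init(pos);
        if (m->ops->release)
            m->ops->release(m);
    }
}

/* Example callback: just records that it ran. */
static void mark_released_sketch(struct mirror_sketch *m) { m->released = 1; }
```

With a plain unlink that leaves stale prev/next pointers, the second removal in an unregister path would write through pointers to entries that may already be gone; re-initialising the node makes it point at itself instead.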

+ mm-hmm-fix-header-file-if-else-endif-maze-v2.patch added to -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: mm/hmm: fix header file if/else/endif maze v2
has been added to the -mm tree.  Its filename is
     mm-hmm-fix-header-file-if-else-endif-maze-v2.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-hmm-fix-header-file-if-else-end…
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-hmm-fix-header-file-if-else-end…

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Jérôme Glisse <jglisse(a)redhat.com>
Subject: mm/hmm: fix header file if/else/endif maze v2

The #if/#else/#endif for IS_ENABLED(CONFIG_HMM) were wrong.  Because of
this, after multiple includes there were multiple definitions of both
hmm_mm_init() and hmm_mm_destroy(), leading to a build failure if HMM was
enabled (CONFIG_HMM set).

Link: http://lkml.kernel.org/r/20180320020038.3360-3-jglisse@redhat.com
Signed-off-by: Jérôme Glisse <jglisse(a)redhat.com>
Acked-by: Balbir Singh <bsingharora(a)gmail.com>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Ralph Campbell <rcampbell(a)nvidia.com>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: Evgeny Baskakov <ebaskakov(a)nvidia.com>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: Logan Gunthorpe <logang(a)deltatee.com>
Cc: Mark Hairgrove <mhairgrove(a)nvidia.com>
Cc: Stephen Bates <sbates(a)raithlin.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

diff -puN include/linux/hmm.h~mm-hmm-fix-header-file-if-else-endif-maze-v2 include/linux/hmm.h
--- a/include/linux/hmm.h~mm-hmm-fix-header-file-if-else-endif-maze-v2
+++ a/include/linux/hmm.h
@@ -498,23 +498,16 @@ struct hmm_device {
 struct hmm_device *hmm_device_new(void *drvdata);
 void hmm_device_put(struct hmm_device *hmm_device);
 #endif /* CONFIG_DEVICE_PRIVATE || CONFIG_DEVICE_PUBLIC */
-#endif /* IS_ENABLED(CONFIG_HMM) */
 
 /* Below are for HMM internal use only! Not to be used by device driver! */
-#if IS_ENABLED(CONFIG_HMM_MIRROR)
 void hmm_mm_destroy(struct mm_struct *mm);
 
 static inline void hmm_mm_init(struct mm_struct *mm)
 {
 	mm->hmm = NULL;
 }
-#else /* IS_ENABLED(CONFIG_HMM_MIRROR) */
-static inline void hmm_mm_destroy(struct mm_struct *mm) {}
-static inline void hmm_mm_init(struct mm_struct *mm) {}
-#endif /* IS_ENABLED(CONFIG_HMM_MIRROR) */
-
-
 #else /* IS_ENABLED(CONFIG_HMM) */
 static inline void hmm_mm_destroy(struct mm_struct *mm) {}
 static inline void hmm_mm_init(struct mm_struct *mm) {}
+#endif /* IS_ENABLED(CONFIG_HMM) */
 #endif /* LINUX_HMM_H */
_

Patches currently in -mm which might be from jglisse(a)redhat.com are

mm-hmm-fix-header-file-if-else-endif-maze-v2.patch
mm-hmm-unregister-mmu_notifier-when-last-hmm-client-quit.patch
mm-hmm-hmm_pfns_bad-was-accessing-wrong-struct.patch
mm-hmm-use-struct-for-hmm_vma_fault-hmm_vma_get_pfns-parameters-v2.patch
mm-hmm-remove-hmm_pfn_read-flag-and-ignore-peculiar-architecture-v2.patch
mm-hmm-use-uint64_t-for-hmm-pfn-instead-of-defining-hmm_pfn_t-to-ulong-v2.patch
mm-hmm-cleanup-special-vma-handling-vm_special.patch
mm-hmm-do-not-differentiate-between-empty-entry-or-missing-directory-v2.patch
mm-hmm-rename-hmm_pfn_device_unaddressable-to-hmm_pfn_device_private.patch
mm-hmm-move-hmm_pfns_clear-closer-to-where-it-is-use.patch
mm-hmm-factor-out-pte-and-pmd-handling-to-simplify-hmm_vma_walk_pmd.patch
mm-hmm-change-hmm_vma_fault-to-allow-write-fault-on-page-basis.patch
mm-hmm-use-device-driver-encoding-for-hmm-pfn-v2.patch
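The layout the fix arrives at is a single conditional with the real definitions on one branch and empty stubs on the other, all inside the include guard, so each helper is defined exactly once however often the header is included. A stand-alone sketch of that shape, assuming a hypothetical CONFIG_FEATURE_SKETCH switch and hypothetical helper names (plain defined() is used because the kernel's IS_ENABLED() is not available in user space):

```c
#ifndef FEATURE_SKETCH_H
#define FEATURE_SKETCH_H

#include <stddef.h>

/* Hypothetical stand-in for mm_struct with its hmm pointer. */
struct mm_sketch { void *hmm; };

#if defined(CONFIG_FEATURE_SKETCH)

/* Feature built in: the init helper does real work. */
static inline void feature_mm_init(struct mm_sketch *mm) { mm->hmm = NULL; }
static inline int feature_enabled(void) { return 1; }

#else /* CONFIG_FEATURE_SKETCH */

/* Feature compiled out: same names, empty bodies, callers need no #ifdef. */
static inline void feature_mm_init(struct mm_sketch *mm) { (void)mm; }
static inline int feature_enabled(void) { return 0; }

#endif /* CONFIG_FEATURE_SKETCH */

#endif /* FEATURE_SKETCH_H */
```

Compiling with -DCONFIG_FEATURE_SKETCH selects the first branch; without it the stubs are used. Either way every #if is closed before the guard's final #endif, which is the property the broken header had lost.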

+ mm-hugetlb-prevent-hugetlb-vma-to-be-misaligned.patch added to -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: mm/hugetlb: prevent hugetlb VMA to be misaligned
has been added to the -mm tree.  Its filename is
     mm-hugetlb-prevent-hugetlb-vma-to-be-misaligned.patch

This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-hugetlb-prevent-hugetlb-vma-to-…
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-hugetlb-prevent-hugetlb-vma-to-…

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included into linux-next and is updated
there every 3-4 working days

------------------------------------------------------
From: Laurent Dufour <ldufour(a)linux.vnet.ibm.com>
Subject: mm/hugetlb: prevent hugetlb VMA to be misaligned

When running the sampler detailed below, the kernel, if built with the VM
debug option turned on (as many distros do), panics with the following
message:

kernel BUG at /build/linux-jWa1Fv/linux-4.15.0/mm/hugetlb.c:3310!
Oops: Exception in kernel mode, sig: 5 [#1]
LE SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in: kcm nfc af_alg caif_socket caif phonet fcrypt
		8<--8<--8<--8< snip 8<--8<--8<--8<
CPU: 18 PID: 43243 Comm: trinity-subchil Tainted: G C E 4.15.0-10-generic #11-Ubuntu
NIP: c00000000036e764 LR: c00000000036ee48 CTR: 0000000000000009
REGS: c000003fbcdcf810 TRAP: 0700 Tainted: G C E (4.15.0-10-generic)
MSR: 9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24002222 XER: 20040000
CFAR: c00000000036ee44 SOFTE: 1
GPR00: c00000000036ee48 c000003fbcdcfa90 c0000000016ea600 c000003fbcdcfc40
GPR04: c000003fd9858950 00007115e4e00000 00007115e4e10000 0000000000000000
GPR08: 0000000000000010 0000000000010000 0000000000000000 0000000000000000
GPR12: 0000000000002000 c000000007a2c600 00000fe3985954d0 00007115e4e00000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 00000fe398595a94 000000000000a6fc c000003fd9858950 0000000000018554
GPR24: c000003fdcd84500 c0000000019acd00 00007115e4e10000 c000003fbcdcfc40
GPR28: 0000000000200000 00007115e4e00000 c000003fbc9ac600 c000003fd9858950
NIP [c00000000036e764] __unmap_hugepage_range+0xa4/0x760
LR [c00000000036ee48] __unmap_hugepage_range_final+0x28/0x50
Call Trace:
[c000003fbcdcfa90] [00007115e4e00000] 0x7115e4e00000 (unreliable)
[c000003fbcdcfb50] [c00000000036ee48] __unmap_hugepage_range_final+0x28/0x50
[c000003fbcdcfb80] [c00000000033497c] unmap_single_vma+0x11c/0x190
[c000003fbcdcfbd0] [c000000000334e14] unmap_vmas+0x94/0x140
[c000003fbcdcfc20] [c00000000034265c] exit_mmap+0x9c/0x1d0
[c000003fbcdcfce0] [c000000000105448] mmput+0xa8/0x1d0
[c000003fbcdcfd10] [c00000000010fad0] do_exit+0x360/0xc80
[c000003fbcdcfdd0] [c0000000001104c0] do_group_exit+0x60/0x100
[c000003fbcdcfe10] [c000000000110584] SyS_exit_group+0x24/0x30
[c000003fbcdcfe30] [c00000000000b184] system_call+0x58/0x6c
Instruction dump:
552907fe e94a0028 e94a0408 eb2a0018 81590008 7f9c5036 0b090000 e9390010
7d2948f8 7d2a2838 0b0a0000 7d293038 <0b090000> e9230086 2fa90000 419e0468
===[ end trace ee88f958a1c62605 ]===

The panic is due to a VMA pointing to a hugetlb area while the
vma->vm_start or vma->vm_end fields are not aligned to the huge page
boundaries.  The sampler is just unmapping a part of the hugetlb area,
leading to 2 VMAs which are not well aligned.  The same could be achieved
by calling madvise(), as is the case when running:
stress-ng --shm-sysv 1

The hugetlb code is assuming that the VMA will be well aligned when it is
unmapped, so we must prevent such a VMA from being split or shrunk to a
misaligned address.

This patch prevents this by checking the new VMA's boundaries when a VMA
is modified by calling vma_adjust().

=== Sampler used to hit the panic
/*
 * Only the <sys/ipc.h> include survives in the archived copy of the
 * sampler; the other headers listed here are the ones the calls below
 * need.  The definitions of LENGTH and HPSIZE were also lost and are
 * not reconstructed here.
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/shm.h>
#include <sys/ipc.h>

unsigned long page_size;

int main(void)
{
	int shmid, ret=1;
	void *addr;

	setbuf(stdout, NULL);
	page_size = getpagesize();

	shmid = shmget(0x1410, LENGTH, IPC_CREAT | SHM_HUGETLB | SHM_R | SHM_W);
	if (shmid < 0) {
		perror("shmget");
		exit(1);
	}
	printf("shmid: %d\n", shmid);

	addr = shmat(shmid, NULL, 0);
	if (addr == (void*)-1) {
		perror("shmat");
		goto out;
	}

	/*
	 * The following munmap() call will split the VMA in 2, leading to
	 * unaligned to huge page size VMAs which will trigger a check when
	 * shmdt() is called.
	 */
	if (munmap(addr + HPSIZE + page_size, page_size)) {
		perror("munmap");
		goto out;
	}

	if (shmdt(addr)) {
		perror("shmdt");
		goto out;
	}

	printf("test done.\n");
	ret = 0;

out:
	shmctl(shmid, IPC_RMID, NULL);
	return ret;
}
=== End of code

Link: http://lkml.kernel.org/r/1521566754-30390-1-git-send-email-ldufour@linux.vn…
Signed-off-by: Laurent Dufour <ldufour(a)linux.vnet.ibm.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

diff -puN mm/mmap.c~mm-hugetlb-prevent-hugetlb-vma-to-be-misaligned mm/mmap.c
--- a/mm/mmap.c~mm-hugetlb-prevent-hugetlb-vma-to-be-misaligned
+++ a/mm/mmap.c
@@ -692,6 +692,17 @@ int __vma_adjust(struct vm_area_struct *
 	long adjust_next = 0;
 	int remove_next = 0;
 
+	if (is_vm_hugetlb_page(vma)) {
+		/*
+		 * We must check against the huge page boundarie to not
+		 * create misaligned VMA.
+		 */
+		struct hstate *h = hstate_vma(vma);
+
+		if (start & ~huge_page_mask(h) || end & ~huge_page_mask(h))
+			return -EINVAL;
+	}
+
 	if (next && !insert) {
 		struct vm_area_struct *exporter = NULL, *importer = NULL;
 
_

Patches currently in -mm which might be from ldufour(a)linux.vnet.ibm.com are

mm-hugetlb-prevent-hugetlb-vma-to-be-misaligned.patch
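The check added to __vma_adjust() is a plain mask test: an address is aligned to the huge page size exactly when none of the low bits cleared by the mask are set. The sketch below reproduces that arithmetic in user space. The helper names are hypothetical, the huge page size is passed in explicitly instead of coming from an hstate, and it must be a power of two.

```c
#include <stdbool.h>

/* Mask that keeps the huge-page-aligned part of an address. */
static unsigned long huge_mask_sketch(unsigned long huge_page_size)
{
    return ~(huge_page_size - 1);
}

/* True when both ends of [start, end) sit on a huge page boundary. */
static bool range_is_huge_aligned(unsigned long start, unsigned long end,
                                  unsigned long huge_page_size)
{
    unsigned long mask = huge_mask_sketch(huge_page_size);

    /* Same shape as the patch: any bit outside the mask means misaligned. */
    return !(start & ~mask) && !(end & ~mask);
}
```

For example, with an assumed 2 MiB huge page, the range 0x200000..0x400000 passes, while 0x200000..0x210000 (a split 64 KiB into the page) fails, which is the case the patch turns into -EINVAL.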

All builds broken in v4.1.y.queue

by Guenter Roeck

kernel/locking/mutex.c: In function '__mutex_unlock_common_slowpath':
build/kernel/locking/mutex.c:721:2: error: implicit declaration of function 'WAKE_Q'
kernel/locking/mutex.c:721:9: error: 'wake_q' undeclared

This affects all architectures and all builds.

Guenter

[PATCH] nvme: Skip checking heads without namespaces

by Keith Busch

If a task is holding a reference to a namespace on a removed controller,
the head will not be released. If the same controller is added again
later, its namespaces may not be successfully added. Instead, the user
will see the kernel message "Duplicate IDs for nsid <X>".

This patch fixes that by skipping heads that don't have namespaces when
considering if a new namespace is safe to add.

Reported-by: Alex Gagniuc <Alex_Gagniuc(a)Dellteam.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Keith Busch <keith.busch(a)intel.com>
---
 drivers/nvme/host/core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 7aeca5db7916..0b9e60861e53 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2793,6 +2793,7 @@ static int __nvme_check_ids(struct nvme_subsystem *subsys,
 
 	list_for_each_entry(h, &subsys->nsheads, entry) {
 		if (nvme_ns_ids_valid(&new->ids) &&
+		    !list_empty(&h->list) &&
 		    nvme_ns_ids_equal(&new->ids, &h->ids))
 			return -EINVAL;
 	}
-- 
2.14.3
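The effect of the one-line change can be shown with a small user-space model: a head that still exists but has no namespaces attached no longer counts as a duplicate. This is only a sketch under stated assumptions. The structures, the single 16-byte identifier, the namespace counter standing in for !list_empty(&h->list), and the -1 return value are all hypothetical simplifications of the nvme code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced identifier set: one 16-byte ID only. */
struct ids_sketch { unsigned char nguid[16]; };

struct head_sketch {
    struct ids_sketch ids;
    size_t nr_namespaces;  /* 0 models a head kept alive only by a reference */
};

/* An all-zero identifier is treated as "no identifier". */
static bool ids_valid_sketch(const struct ids_sketch *ids)
{
    static const unsigned char zero[16];

    return memcmp(ids->nguid, zero, sizeof(zero)) != 0;
}

/* Returns -1 on a duplicate (the patch returns -EINVAL), 0 otherwise. */
static int check_ids_sketch(const struct head_sketch *heads, size_t n,
                            const struct ids_sketch *new_ids)
{
    for (size_t i = 0; i < n; i++) {
        if (ids_valid_sketch(new_ids) &&
            heads[i].nr_namespaces > 0 &&  /* skip heads without namespaces */
            memcmp(new_ids->nguid, heads[i].ids.nguid,
                   sizeof(new_ids->nguid)) == 0)
            return -1;
    }
    return 0;
}
```

A stale head with matching identifiers and zero namespaces is ignored, so re-adding the same controller succeeds; a head that still has a namespace attached keeps blocking the duplicate.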

Build/test failures in v4.1.50

by Guenter Roeck

Hi,

unfortunately I missed the message that the stable release queue
repository for 4.1.y changed. It looks like I am not copied on release
announcements, so I missed the last couple of releases. As a result, my
builders did not catch any changes for some time.

I just ran a test on v4.1.50 and got the following results.

Build results:
	total: 139 pass: 133 fail: 6
Failed builds:
	arm:axm55xx_defconfig
	arm64:allmodconfig
	i386:defconfig
	i386:allyesconfig
	i386:allmodconfig
	i386:allnoconfig

Qemu test results:
	total: 125 pass: 111 fail: 14
Failed tests:
	arm64:virt:smp:defconfig:initrd
	arm64:xlnx-zcu102:smp:defconfig:initrd:zynqmp-ep108
	arm64:virt:nosmp:defconfig:initrd
	arm64:xlnx-zcu102:nosmp:defconfig:initrd:zynqmp-ep108
	x86:Broadwell:q35:x86_pc_defconfig
	x86:Skylake-Client:q35:x86_pc_defconfig
	x86:SandyBridge:q35:x86_pc_defconfig
	x86:Haswell:pc:x86_pc_defconfig
	x86:Nehalem:q35:x86_pc_defconfig
	x86:phenom:pc:x86_pc_defconfig
	x86:core2duo:q35:x86_pc_nosmp_defconfig
	x86:Conroe:isapc:x86_pc_nosmp_defconfig
	x86:Opteron_G1:pc:x86_pc_nosmp_defconfig
	x86:n270:isapc:x86_pc_nosmp_defconfig

From the logs:

Building arm:axm55xx_defconfig ... failed
--------------
Error log:
arch/arm/kvm/handle_exit.c: In function 'handle_hvc':
arch/arm/kvm/handle_exit.c:48:3: error: implicit declaration of function 'vcpu_set_reg'

Building arm64:allmodconfig ... failed
--------------
Error log:
arch/arm64/kvm/handle_exit.c:45:3: error: implicit declaration of function ‘vcpu_set_reg’; did you mean ‘vcpu_sys_reg’?

Building i386:defconfig ... failed
--------------
Error log:
arch/x86/lib/checksum_32.S:32:31: fatal error: asm/nospec-branch.h

All other failures have the same cause.

Guenter

Please apply 9b54d816e0042 ("blkcg: fix double free...") to v4.1.y..v4.9.y

by Guenter Roeck

Hi Greg, Hi Sasha,

commit 9b54d816e0042 ("blkcg: fix double free of new_blkg in
blkcg_init_queue") fixes CVE-2018-7480.
See https://nvd.nist.gov/vuln/detail/CVE-2018-7480 for details.

Please apply to v4.1.y .. v4.9.y.

Thanks,
Guenter

[PATCH 1/6] mm/vmscan: Wake up flushers for legacy cgroups too

by Andrey Ryabinin

Commit 726d061fbd36 ("mm: vmscan: kick flushers when we encounter
dirty pages on the LRU") added flusher invocation to
shrink_inactive_list() when many dirty pages on the LRU are encountered.

However, shrink_inactive_list() doesn't wake up flushers for legacy
cgroup reclaim, so the next commit bbef938429f5 ("mm: vmscan: remove
old flusher wakeup from direct reclaim path") removed the only source
of flusher wake-ups in the legacy mem cgroup reclaim path.

This leads to premature OOM if there are too many dirty pages in a cgroup:
    # mkdir /sys/fs/cgroup/memory/test
    # echo $$ > /sys/fs/cgroup/memory/test/tasks
    # echo 50M > /sys/fs/cgroup/memory/test/memory.limit_in_bytes
    # dd if=/dev/zero of=tmp_file bs=1M count=100
    Killed

    dd invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null), order=0, oom_score_adj=0

    Call Trace:
     dump_stack+0x46/0x65
     dump_header+0x6b/0x2ac
     oom_kill_process+0x21c/0x4a0
     out_of_memory+0x2a5/0x4b0
     mem_cgroup_out_of_memory+0x3b/0x60
     mem_cgroup_oom_synchronize+0x2ed/0x330
     pagefault_out_of_memory+0x24/0x54
     __do_page_fault+0x521/0x540
     page_fault+0x45/0x50

    Task in /test killed as a result of limit of /test
    memory: usage 51200kB, limit 51200kB, failcnt 73
    memory+swap: usage 51200kB, limit 9007199254740988kB, failcnt 0
    kmem: usage 296kB, limit 9007199254740988kB, failcnt 0
    Memory cgroup stats for /test: cache:49632KB rss:1056KB rss_huge:0KB shmem:0KB
            mapped_file:0KB dirty:49500KB writeback:0KB swap:0KB inactive_anon:0KB
            active_anon:1168KB inactive_file:24760KB active_file:24960KB unevictable:0KB
    Memory cgroup out of memory: Kill process 3861 (bash) score 88 or sacrifice child
    Killed process 3876 (dd) total-vm:8484kB, anon-rss:1052kB, file-rss:1720kB, shmem-rss:0kB
    oom_reaper: reaped process 3876 (dd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

Wake up flushers in legacy cgroup reclaim too.

Fixes: bbef938429f5 ("mm: vmscan: remove old flusher wakeup from direct reclaim path")
Signed-off-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: <stable(a)vger.kernel.org>
---
 mm/vmscan.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 8fcd9f8d7390..4390a8d5be41 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1771,6 +1771,20 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 	if (stat.nr_writeback && stat.nr_writeback == nr_taken)
 		set_bit(PGDAT_WRITEBACK, &pgdat->flags);
 
+	/*
+	 * If dirty pages are scanned that are not queued for IO, it
+	 * implies that flushers are not doing their job. This can
+	 * happen when memory pressure pushes dirty pages to the end of
+	 * the LRU before the dirty limits are breached and the dirty
+	 * data has expired. It can also happen when the proportion of
+	 * dirty pages grows not through writes but through memory
+	 * pressure reclaiming all the clean cache. And in some cases,
+	 * the flushers simply cannot keep up with the allocation
+	 * rate. Nudge the flusher threads in case they are asleep.
+	 */
+	if (stat.nr_unqueued_dirty == nr_taken)
+		wakeup_flusher_threads(WB_REASON_VMSCAN);
+
 	/*
 	 * Legacy memcg will stall in page writeback so avoid forcibly
 	 * stalling here.
@@ -1783,22 +1797,9 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		if (stat.nr_dirty && stat.nr_dirty == stat.nr_congested)
 			set_bit(PGDAT_CONGESTED, &pgdat->flags);
 
-		/*
-		 * If dirty pages are scanned that are not queued for IO, it
-		 * implies that flushers are not doing their job. This can
-		 * happen when memory pressure pushes dirty pages to the end of
-		 * the LRU before the dirty limits are breached and the dirty
-		 * data has expired. It can also happen when the proportion of
-		 * dirty pages grows not through writes but through memory
-		 * pressure reclaiming all the clean cache. And in some cases,
-		 * the flushers simply cannot keep up with the allocation
-		 * rate. Nudge the flusher threads in case they are asleep, but
-		 * also allow kswapd to start writing pages during reclaim.
-		 */
-		if (stat.nr_unqueued_dirty == nr_taken) {
-			wakeup_flusher_threads(WB_REASON_VMSCAN);
+		/* Allow kswapd to start writing pages during reclaim. */
+		if (stat.nr_unqueued_dirty == nr_taken)
 			set_bit(PGDAT_DIRTY, &pgdat->flags);
-		}
 
 		/*
 		 * If kswapd scans pages marked marked for immediate
-- 
2.16.1
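What the patch changes is where the wake-up sits relative to the block that is skipped for legacy cgroup reclaim. The sketch below models only that ordering as a pure function. It assumes, as the commit message implies but the quoted hunks do not show, that the code following the "Legacy memcg" comment runs only for non-legacy reclaim; all _sketch names are hypothetical.

```c
#include <stdbool.h>

/* Reduced scan statistics: only the counter the patch looks at. */
struct scan_stat_sketch { unsigned long nr_unqueued_dirty; };

/* What the post-patch code would do after scanning nr_taken pages. */
struct outcome_sketch {
    bool flushers_woken;     /* stands in for wakeup_flusher_threads() */
    bool node_marked_dirty;  /* stands in for setting PGDAT_DIRTY */
};

static struct outcome_sketch after_scan_sketch(struct scan_stat_sketch stat,
                                               unsigned long nr_taken,
                                               bool legacy_memcg_reclaim)
{
    struct outcome_sketch out = { false, false };
    bool all_unqueued_dirty = stat.nr_unqueued_dirty == nr_taken;

    /* After the patch the wake-up happens for every kind of reclaim. */
    if (all_unqueued_dirty)
        out.flushers_woken = true;

    /* The node flag stays inside the part skipped for legacy reclaim. */
    if (!legacy_memcg_reclaim && all_unqueued_dirty)
        out.node_marked_dirty = true;

    return out;
}
```

Before the patch both actions lived in the second conditional, so a legacy memory cgroup full of dirty page cache never nudged the flushers, which matches the premature OOM shown in the commit message.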

Re: 4.14.28 crash on tcp_push

by Willy Tarreau

Hi Pavlos,

On Tue, Mar 20, 2018 at 01:01:38PM +0100, Pavlos Parissis wrote:
> Hi,
>
> We were upgrading a production system from 4.14.20 to 4.14.28 and we got the following crash and I
> was wondering if anyone has seen similar crash:
>
> [ 346.435832] BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
> [ 346.473216] IP: tcp_push+0x42/0x120
> [ 346.489607] PGD 8000001838949067 P4D 8000001838949067 PUD 183894a067 PMD 0
> [ 346.523318] Oops: 0002 [#1] SMP PTI
> [ 346.540395] Modules linked in: sctp_diag sctp dccp_diag dccp udp_diag unix_diag tcp_diag
> inet_diag 8021q garp mrp input_leds joydev xfs libcrc32c loop vfat fat x86_pkg_temp_thermal
> intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel
> pcbc aesni_intel iTCO_wdt crypto_simd glue_helper cryptd iTCO_vendor_support intel_cstate lpc_ich
> intel_rapl_perf mfd_core hpwdt i2c_i801 hpilo pcspkr wmi sg ipmi_si ipmi_devintf ipmi_msghandler
> acpi_power_meter shpchp ioatdma ip_tables ext4 mbcache jbd2 mgag200 i2c_algo_bit drm_kms_helper
> syscopyarea sysfillrect sysimgblt fb_sys_fops sd_mod ttm crc32c_intel ixgbe mdio hpsa tg3 i40e drm
> dca ptp pps_core scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod dax
> [ 346.854574] CPU: 5 PID: 1533 Comm: carbon-submissi Not tainted 4.14.28-1.el7.x86_64 #1
> [ 346.892452] Hardware name: HP ProLiant DL380 Gen9/ProLiant DL380 Gen9, BIOS P89 04/25/2017
> [ 346.931641] task: ffff88183806c5c0 task.stack: ffffc90007ea8000
> [ 346.959768] RIP: 0010:tcp_push+0x42/0x120
> [ 346.978914] RSP: 0018:ffffc90007eabc78 EFLAGS: 00010246
> [ 347.004199] RAX: 0000000000000000 RBX: 00000000000000c2 RCX: 0000000000000001
> [ 347.038684] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88184ad0c800
> [ 347.073236] RBP: ffffc90007eabc78 R08: 000000000000ffcb R09: 0000000000000257
> [ 347.108070] R10: ffff88184ad0c958 R11: 000000000000ffcb R12: 00000000ffffffe0
> [ 347.142006] R13: 00000000ffffffe0 R14: ffff88184ad0c800 R15: ffff88184ad0c958
> [ 347.176290] FS: 00007fbad3ff7700(0000) GS:ffff880c4fd40000(0000) knlGS:0000000000000000
> [ 347.215545] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 347.243091] CR2: 0000000000000038 CR3: 0000001838bd4004 CR4: 00000000001606e0
> [ 347.276950] Call Trace:
> [ 347.288526] tcp_sendmsg_locked+0x118/0xe50
> [ 347.308321] tcp_sendmsg+0x2c/0x50
> [ 347.324517] inet_sendmsg+0x37/0xb0
> [ 347.341379] sock_sendmsg+0x3e/0x50
> [ 347.358018] sock_write_iter+0x85/0xf0
> [ 347.376095] __vfs_write+0xfb/0x160
> [ 347.392961] vfs_write+0xb2/0x1b0
> [ 347.408915] ? syscall_trace_enter+0x1cd/0x2b0
> [ 347.430458] SyS_write+0x55/0xc0
> [ 347.446047] do_syscall_64+0x79/0x1b0
> [ 347.463757] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> [ 347.488102] RIP: 0033:0x7fbae295a6ad
> [ 347.505100] RSP: 002b:00007fbad3ff6e60 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
> [ 347.541196] RAX: ffffffffffffffda RBX: 00000000000000c7 RCX: 00007fbae295a6ad
> [ 347.575649] RDX: 00000000000000c7 RSI: 00007fbabc06ab60 RDI: 0000000000000013
> [ 347.609762] RBP: 000000000000000a R08: 00007fbabc06ab60 R09: 00000000022a58f0
> [ 347.643653] R10: 0000000000001a05 R11: 0000000000000293 R12: 00007fbabc06ab60
> [ 347.677684] R13: 000000000200f040 R14: 00000000022a1840 R15: 00000000000000cb
> [ 347.712207] Code: 48 8b 87 60 01 00 00 4c 8d 97 58 01 00 00 41 89 d3 ba 00 00 00 00 49 39 c2 48
> 0f 44 c2 89 f2 81 e2 00 80 00 00 0f 85 af 00 00 00 <80> 48 38 08 44 8b 8f 74 06 00 00 44 89 8f 7c 06
> 00 00 83 e6 01
> [ 347.803312] RIP: tcp_push+0x42/0x120 RSP: ffffc90007eabc78
> [ 347.829666] CR2: 0000000000000038
> [ 347.845805] ---[ end trace 031807a627822772 ]---
> [ 347.873681] Kernel panic - not syncing: Fatal exception
> [ 347.898899] Kernel Offset: disabled
> [ 347.920580] Rebooting in 70 seconds..

Interesting, I also experienced a spontaneous panic on my home firewall
after upgrading it from 4.14.10 to 4.14.27, but I didn't have any symbol
in the traces so the dump wasn't exploitable. All I know is that it was
a NULL deref with a very small offset as well. It may be totally unrelated
though but the coincidence is troubling, especially since I haven't had
a panic in -stable for a very long time.

Ah I've just seen your second e-mail. So if it's the same as the patch
you pointed, the bug is 4.14-only and the fix as well. It will likely
come with the next batch of networking backports.

Cheers,
Willy
