- Linux-stable-mirror - lists.linaro.org

[PATCH] x86/entry/64: Remove %ebx handling from error_entry/exit

by Andy Lutomirski

error_entry and error_exit communicate the user vs kernel status of
the frame using %ebx.  This is unnecessary -- the information is in
regs->cs.  Just use regs->cs.

This makes error_entry simpler and makes error_exit more robust.

It also fixes a nasty bug.  Before all the Spectre nonsense, the
xen_failsafe_callback entry point returned like this:

        ALLOC_PT_GPREGS_ON_STACK
        SAVE_C_REGS
        SAVE_EXTRA_REGS
        ENCODE_FRAME_POINTER
        jmp     error_exit

And it did not go through error_entry.  This was bogus: RBX
contained garbage, and error_exit expected a flag in RBX.
Fortunately, it generally contained *nonzero* garbage, so the
correct code path was used.  As part of the Spectre fixes, code was
added to clear RBX to mitigate certain speculation attacks.  Now,
depending on kernel configuration, RBX got zeroed and, when running
some Wine workloads, the kernel crashes.  This was introduced by:

    commit 3ac6d8c787b8 ("x86/entry/64: Clear registers for exceptions/interrupts, to reduce speculation attack surface")

With this patch applied, RBX is no longer needed as a flag, and the
problem goes away.

I suspect that malicious userspace could use this bug to crash the
kernel even without the offending patch applied, though.

[Historical note: I wrote this patch as a cleanup before I was aware
 of the bug it fixed.]

[Note to stable maintainers: this should probably get applied to all
 kernels.  If you're nervous about that, a more conservative fix to
 add xorl %ebx,%ebx; incl %ebx before the jump to error_exit should
 also fix the problem.]

Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Dominik Brodowski <linux(a)dominikbrodowski.net>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: xen-devel(a)lists.xenproject.org
Cc: x86(a)kernel.org
Cc: stable(a)vger.kernel.org
Fixes: 3ac6d8c787b8 ("x86/entry/64: Clear registers for exceptions/interrupts, to reduce speculation attack surface")
Reported-and-tested-by: "M. Vefa Bicakci" <m.v.b(a)runbox.com>
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
---

I could also submit the conservative fix tagged for -stable and respin
this on top of it.  Ingo, Greg, what do you prefer?

 arch/x86/entry/entry_64.S | 18 ++++--------------
 1 file changed, 4 insertions(+), 14 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 73a522d53b53..8ae7ffda8f98 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -981,7 +981,7 @@ ENTRY(\sym)
 
 	call	\do_sym
 
-	jmp	error_exit			/* %ebx: no swapgs flag */
+	jmp	error_exit
 	.endif
 END(\sym)
 .endm
@@ -1222,7 +1222,6 @@ END(paranoid_exit)
 
 /*
  * Save all registers in pt_regs, and switch GS if needed.
- * Return: EBX=0: came from user mode; EBX=1: otherwise
  */
 ENTRY(error_entry)
 	UNWIND_HINT_FUNC
@@ -1269,7 +1268,6 @@ ENTRY(error_entry)
 	 * for these here too.
 	 */
 .Lerror_kernelspace:
-	incl	%ebx
 	leaq	native_irq_return_iret(%rip), %rcx
 	cmpq	%rcx, RIP+8(%rsp)
 	je	.Lerror_bad_iret
@@ -1303,28 +1301,20 @@ ENTRY(error_entry)
 
 	/*
 	 * Pretend that the exception came from user mode: set up pt_regs
-	 * as if we faulted immediately after IRET and clear EBX so that
-	 * error_exit knows that we will be returning to user mode.
+	 * as if we faulted immediately after IRET.
 	 */
 	mov	%rsp, %rdi
 	call	fixup_bad_iret
 	mov	%rax, %rsp
-	decl	%ebx
 	jmp	.Lerror_entry_from_usermode_after_swapgs
 END(error_entry)
 
-
-/*
- * On entry, EBX is a "return to kernel mode" flag:
- *   1: already in kernel mode, don't need SWAPGS
- *   0: user gsbase is loaded, we need SWAPGS and standard preparation for return to usermode
- */
 ENTRY(error_exit)
 	UNWIND_HINT_REGS
 	DISABLE_INTERRUPTS(CLBR_ANY)
 	TRACE_IRQS_OFF
-	testl	%ebx, %ebx
-	jnz	retint_kernel
+	testb	$3, CS(%rsp)
+	jz	retint_kernel
 	jmp	retint_user
 END(error_exit)
 
-- 
2.17.1
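
For readers less used to the entry assembly, the new test in error_exit (testb $3, CS(%rsp)) can be read as a small C predicate on the saved code-segment selector. The following is only a sketch in plain C, not kernel code: the helper name frame_from_user and the two example selector values are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative selector values, assumed for this sketch only. */
#define EXAMPLE_KERNEL_CS 0x10u	/* low two bits are 0 */
#define EXAMPLE_USER_CS   0x33u	/* low two bits are 3 */

/*
 * Hypothetical helper: the low two bits of a saved CS selector give the
 * privilege level the interrupted code ran at. Zero means kernel mode,
 * anything else means the frame came from user mode.
 */
static bool frame_from_user(uint64_t saved_cs)
{
	return (saved_cs & 3) != 0;
}
```

Because the answer is derived from the saved frame itself, an entry path that skips error_entry cannot leave it uninitialized, which is the robustness argument made in the changelog.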

7 years, 5 months


[PATCH] USB: option: add support for DW5821e

by Aleksander Morgado

The device exposes AT, NMEA and DIAG ports in both USB configurations.

The patch explicitly ignores interfaces 0 and 1, as they're bound to
other drivers already; and also interface 6, which is a GNSS interface
for which we don't have a driver yet.

T: Bus=01 Lev=03 Prnt=04 Port=00 Cnt=01 Dev#= 18 Spd=480 MxCh= 0
D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 2
P: Vendor=413c ProdID=81d7 Rev=03.18
S: Manufacturer=DELL
S: Product=DW5821e Snapdragon X20 LTE
S: SerialNumber=0123456789ABCDEF
C: #Ifs= 7 Cfg#= 2 Atr=a0 MxPwr=500mA
I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim
I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim
I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
I: If#= 6 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)

T: Bus=01 Lev=03 Prnt=04 Port=00 Cnt=01 Dev#= 16 Spd=480 MxCh= 0
D: Ver= 2.10 Cls=ef(misc ) Sub=02 Prot=01 MxPS=64 #Cfgs= 2
P: Vendor=413c ProdID=81d7 Rev=03.18
S: Manufacturer=DELL
S: Product=DW5821e Snapdragon X20 LTE
S: SerialNumber=0123456789ABCDEF
C: #Ifs= 6 Cfg#= 1 Atr=a0 MxPwr=500mA
I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
I: If#= 1 Alt= 0 #EPs= 1 Cls=03(HID ) Sub=00 Prot=00 Driver=usbhid
I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option

Signed-off-by: Aleksander Morgado <aleksander(a)aleksander.es>
Cc: stable <stable(a)vger.kernel.org>
---
 drivers/usb/serial/option.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/usb/serial/option.c b/drivers/usb/serial/option.c
index 664e61f16b6a..0215b70c4efc 100644
--- a/drivers/usb/serial/option.c
+++ b/drivers/usb/serial/option.c
@@ -196,6 +196,8 @@ static void option_instat_callback(struct urb *urb);
 #define DELL_PRODUCT_5800_V2_MINICARD_VZW	0x8196	/* Novatel E362 */
 #define DELL_PRODUCT_5804_MINICARD_ATT		0x819b	/* Novatel E371 */
 
+#define DELL_PRODUCT_5821E			0x81d7
+
 #define KYOCERA_VENDOR_ID			0x0c88
 #define KYOCERA_PRODUCT_KPC650			0x17da
 #define KYOCERA_PRODUCT_KPC680			0x180a
@@ -1030,6 +1032,8 @@ static const struct usb_device_id option_ids[] = {
 	{ USB_DEVICE_AND_INTERFACE_INFO(DELL_VENDOR_ID, DELL_PRODUCT_5800_MINICARD_VZW, 0xff, 0xff, 0xff) },
 	{ USB_DEVICE_AND_INTERFACE_INFO(DELL_VENDOR_ID, DELL_PRODUCT_5800_V2_MINICARD_VZW, 0xff, 0xff, 0xff) },
 	{ USB_DEVICE_AND_INTERFACE_INFO(DELL_VENDOR_ID, DELL_PRODUCT_5804_MINICARD_ATT, 0xff, 0xff, 0xff) },
+	{ USB_DEVICE(DELL_VENDOR_ID, DELL_PRODUCT_5821E),
+	  .driver_info = RSVD(0) | RSVD(1) | RSVD(6) },
 	{ USB_DEVICE(ANYDATA_VENDOR_ID, ANYDATA_PRODUCT_ADU_E100A) },	/* ADU-E100, ADU-310 */
 	{ USB_DEVICE(ANYDATA_VENDOR_ID, ANYDATA_PRODUCT_ADU_500A) },
 	{ USB_DEVICE(ANYDATA_VENDOR_ID, ANYDATA_PRODUCT_ADU_620UW) },
-- 
2.18.0
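
The RSVD(0) | RSVD(1) | RSVD(6) entry can be pictured as a per-interface bitmask that the driver consults before binding. The sketch below is a userspace model only: RSVD_BIT and iface_reserved are hypothetical names, and the assumption that each interface number maps to one bit is mine, not something this patch shows.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed model of a reserved-interface flag: one bit per interface number. */
#define RSVD_BIT(ifnum) ((uint32_t)1u << (ifnum))

/* Mask matching the patch: interfaces 0, 1 and 6 are left alone. */
static const uint32_t dw5821e_info = RSVD_BIT(0) | RSVD_BIT(1) | RSVD_BIT(6);

/* Hypothetical probe-time check: true if the interface must be skipped. */
static bool iface_reserved(uint32_t driver_info, unsigned int ifnum)
{
	return ifnum < 32 && (driver_info & RSVD_BIT(ifnum)) != 0;
}
```

With this mask, interfaces 2 to 5 remain available to the serial driver, which lines up with the Driver=option rows in the listings above.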

7 years, 5 months


[merged] mm-memcg-fix-use-after-free-in-mem_cgroup_iter.patch removed from -mm tree

by akpm＠linux-foundation.org


The patch titled
     Subject: mm: memcg: fix use after free in mem_cgroup_iter()
has been removed from the -mm tree.  Its filename was
     mm-memcg-fix-use-after-free-in-mem_cgroup_iter.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Jing Xia <jing.xia.mail(a)gmail.com>
Subject: mm: memcg: fix use after free in mem_cgroup_iter()

It was reported that a kernel crash happened in mem_cgroup_iter(), which
can be triggered if the legacy cgroup-v1 non-hierarchical mode is used.

Unable to handle kernel paging request at virtual address 6b6b6b6b6b6b8f
......
Call trace:
  mem_cgroup_iter+0x2e0/0x6d4
  shrink_zone+0x8c/0x324
  balance_pgdat+0x450/0x640
  kswapd+0x130/0x4b8
  kthread+0xe8/0xfc
  ret_from_fork+0x10/0x20

  mem_cgroup_iter():
      ......
      if (css_tryget(css))    <-- crash here
	    break;
      ......

The crashing reason is that mem_cgroup_iter() uses the memcg object whose
pointer is stored in iter->position, which has been freed before and
filled with POISON_FREE(0x6b).

And the root cause of the use-after-free issue is that
invalidate_reclaim_iterators() fails to reset the value of iter->position
to NULL when the css of the memcg is released in non-hierarchical mode.

Link: http://lkml.kernel.org/r/1531994807-25639-1-git-send-email-jing.xia@unisoc.…
Fixes: 6df38689e0e9 ("mm: memcontrol: fix possible memcg leak due to interrupted reclaim")
Signed-off-by: Jing Xia <jing.xia.mail(a)gmail.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Vladimir Davydov <vdavydov.dev(a)gmail.com>
Cc: <chunyan.zhang(a)unisoc.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/memcontrol.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff -puN mm/memcontrol.c~mm-memcg-fix-use-after-free-in-mem_cgroup_iter mm/memcontrol.c
--- a/mm/memcontrol.c~mm-memcg-fix-use-after-free-in-mem_cgroup_iter
+++ a/mm/memcontrol.c
@@ -850,7 +850,7 @@ static void invalidate_reclaim_iterators
 	int nid;
 	int i;
 
-	while ((memcg = parent_mem_cgroup(memcg))) {
+	for (; memcg; memcg = parent_mem_cgroup(memcg)) {
 		for_each_node(nid) {
 			mz = mem_cgroup_nodeinfo(memcg, nid);
 			for (i = 0; i <= DEF_PRIORITY; i++) {
_

Patches currently in -mm which might be from jing.xia.mail(a)gmail.com are
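
The one-line change swaps a loop that steps to the parent before its first iteration for one that starts at the node it was given. A minimal model of the two loop shapes, assuming a hypothetical struct node with a parent pointer in place of the real memcg hierarchy, looks like this:

```c
#include <stddef.h>

/* Hypothetical stand-in for a memcg: a parent link and a "was reset" mark. */
struct node {
	struct node *parent;
	int position_cleared;
};

/* Old shape: moves to the parent first, so the starting node is skipped. */
static int clear_ancestors_only(struct node *n)
{
	int visited = 0;

	while ((n = n->parent)) {
		n->position_cleared = 1;
		visited++;
	}
	return visited;
}

/* New shape: visits the starting node, then each ancestor in turn. */
static int clear_self_and_ancestors(struct node *n)
{
	int visited = 0;

	for (; n; n = n->parent) {
		n->position_cleared = 1;
		visited++;
	}
	return visited;
}
```

In this model a node with no parent is visited zero times by the old loop and once by the new one, which is the difference the changelog attributes to non-hierarchical mode.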

7 years, 5 months


[merged] thp-fix-data-loss-when-splitting-a-file-pmd.patch removed from -mm tree

by akpm＠linux-foundation.org


The patch titled
     Subject: mm/huge_memory.c: fix data loss when splitting a file pmd
has been removed from the -mm tree.  Its filename was
     thp-fix-data-loss-when-splitting-a-file-pmd.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm/huge_memory.c: fix data loss when splitting a file pmd

__split_huge_pmd_locked() must check if the cleared huge pmd was dirty,
and propagate that to PageDirty: otherwise, data may be lost when a huge
tmpfs page is modified then split then reclaimed.

How has this taken so long to be noticed?  Because there was no problem
when the huge page is written by a write system call (shmem_write_end()
calls set_page_dirty()), nor when the page is allocated for a write fault
(fault_dirty_shared_page() calls set_page_dirty()); but when allocated for
a read fault (which MAP_POPULATE simulates), no set_page_dirty().

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1807111741430.1106@eggly.anvils
Fixes: d21b9e57c74c ("thp: handle file pages in split_huge_pmd()")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reported-by: Ashwin Chaugule <ashwinch(a)google.com>
Reviewed-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: <stable(a)vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/huge_memory.c |    2 ++
 1 file changed, 2 insertions(+)

diff -puN mm/huge_memory.c~thp-fix-data-loss-when-splitting-a-file-pmd mm/huge_memory.c
--- a/mm/huge_memory.c~thp-fix-data-loss-when-splitting-a-file-pmd
+++ a/mm/huge_memory.c
@@ -2084,6 +2084,8 @@ static void __split_huge_pmd_locked(stru
 		if (vma_is_dax(vma))
 			return;
 		page = pmd_page(_pmd);
+		if (!PageDirty(page) && pmd_dirty(_pmd))
+			set_page_dirty(page);
 		if (!PageReferenced(page) && pmd_young(_pmd))
 			SetPageReferenced(page);
 		page_remove_rmap(page, true);
_

Patches currently in -mm which might be from hughd(a)google.com are
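
The two added lines follow the same pattern as the existing referenced-bit handling: state held only in the mapping entry is copied onto the page before the entry goes away. A toy model of that pattern, with hypothetical struct names and plain booleans standing in for the real page and pmd flags, might read:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the bits carried by a huge pmd and by a page. */
struct model_pmd {
	bool dirty;
	bool young;
};

struct model_page {
	bool dirty;
	bool referenced;
};

/*
 * Copy state that lives only in the mapping entry onto the page before the
 * entry is discarded. Without the dirty step, a page modified through the
 * mapping would look clean afterwards.
 */
static void carry_pmd_state(const struct model_pmd *pmd, struct model_page *page)
{
	if (!page->dirty && pmd->dirty)
		page->dirty = true;
	if (!page->referenced && pmd->young)
		page->referenced = true;
}
```

The model deliberately ignores locking and writeback; it only illustrates why a page that looks clean could be dropped at reclaim with its changes lost.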

7 years, 5 months


[merged] fat-fix-memory-allocation-failure-handling-of-match_strdup.patch removed from -mm tree

by akpm＠linux-foundation.org


The patch titled
     Subject: fat: fix memory allocation failure handling of match_strdup()
has been removed from the -mm tree.  Its filename was
     fat-fix-memory-allocation-failure-handling-of-match_strdup.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: OGAWA Hirofumi <hirofumi(a)mail.parknet.co.jp>
Subject: fat: fix memory allocation failure handling of match_strdup()

In parse_options(), if match_strdup() fails, parse_options() leaves
opts->iocharset in an unexpected state (i.e. still pointing to the freed
string).  This can cause a double free.

To fix this, always reinitialize opts->iocharset when freeing.

Link: http://lkml.kernel.org/r/8736wp9dzc.fsf@mail.parknet.co.jp
Signed-off-by: OGAWA Hirofumi <hirofumi(a)mail.parknet.co.jp>
Reported-by: syzbot+90b8e10515ae88228a92(a)syzkaller.appspotmail.com
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 fs/fat/inode.c |   20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff -puN fs/fat/inode.c~fat-fix-memory-allocation-failure-handling-of-match_strdup fs/fat/inode.c
--- a/fs/fat/inode.c~fat-fix-memory-allocation-failure-handling-of-match_strdup
+++ a/fs/fat/inode.c
@@ -707,13 +707,21 @@ static void fat_set_state(struct super_b
 	brelse(bh);
 }
 
+static void fat_reset_iocharset(struct fat_mount_options *opts)
+{
+	if (opts->iocharset != fat_default_iocharset) {
+		/* Note: opts->iocharset can be NULL here */
+		kfree(opts->iocharset);
+		opts->iocharset = fat_default_iocharset;
+	}
+}
+
 static void delayed_free(struct rcu_head *p)
 {
 	struct msdos_sb_info *sbi = container_of(p, struct msdos_sb_info, rcu);
 	unload_nls(sbi->nls_disk);
 	unload_nls(sbi->nls_io);
-	if (sbi->options.iocharset != fat_default_iocharset)
-		kfree(sbi->options.iocharset);
+	fat_reset_iocharset(&sbi->options);
 	kfree(sbi);
 }
 
@@ -1132,7 +1140,7 @@ static int parse_options(struct super_bl
 	opts->fs_fmask = opts->fs_dmask = current_umask();
 	opts->allow_utime = -1;
 	opts->codepage = fat_default_codepage;
-	opts->iocharset = fat_default_iocharset;
+	fat_reset_iocharset(opts);
 	if (is_vfat) {
 		opts->shortname = VFAT_SFN_DISPLAY_WINNT|VFAT_SFN_CREATE_WIN95;
 		opts->rodir = 0;
@@ -1289,8 +1297,7 @@ static int parse_options(struct super_bl
 
 		/* vfat specific */
 		case Opt_charset:
-			if (opts->iocharset != fat_default_iocharset)
-				kfree(opts->iocharset);
+			fat_reset_iocharset(opts);
 			iocharset = match_strdup(&args[0]);
 			if (!iocharset)
 				return -ENOMEM;
@@ -1881,8 +1888,7 @@ out_fail:
 		iput(fat_inode);
 	unload_nls(sbi->nls_io);
 	unload_nls(sbi->nls_disk);
-	if (sbi->options.iocharset != fat_default_iocharset)
-		kfree(sbi->options.iocharset);
+	fat_reset_iocharset(&sbi->options);
 	sb->s_fs_info = NULL;
 	kfree(sbi);
 	return error;
_

Patches currently in -mm which might be from hirofumi(a)mail.parknet.co.jp are

fat-validate-i_start-before-using.patch
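
The helper added by this patch is an instance of a free-and-reset idiom: whenever the owned string is released, the pointer goes straight back to the shared default, so a later cleanup path cannot free it a second time. Here is a userspace sketch of that idiom, using malloc/free in place of the kernel allocators; the names mount_opts, reset_iocharset, set_iocharset and the placeholder default string are all assumptions for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Placeholder default; it is static storage and must never be freed. */
static char default_iocharset[] = "example-default";

struct mount_opts {
	char *iocharset;
};

/* Free an owned string, if any, and fall back to the shared default. */
static void reset_iocharset(struct mount_opts *opts)
{
	if (opts->iocharset != default_iocharset) {
		free(opts->iocharset);	/* free(NULL) is a no-op */
		opts->iocharset = default_iocharset;
	}
}

/* Returns 0 on success, -1 if the copy could not be allocated. */
static int set_iocharset(struct mount_opts *opts, const char *name)
{
	char *copy;

	reset_iocharset(opts);	/* the pointer is valid again before we can fail */
	copy = malloc(strlen(name) + 1);
	if (!copy)
		return -1;
	strcpy(copy, name);
	opts->iocharset = copy;
	return 0;
}
```

If the allocation in set_iocharset fails, opts->iocharset already points at the default, so calling reset_iocharset again during teardown is harmless.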

7 years, 5 months


[alternative-merged] mm-fix-vma_is_anonymous-false-positives.patch removed from -mm tree

by akpm＠linux-foundation.org


The patch titled
     Subject: mm: fix vma_is_anonymous() false-positives
has been removed from the -mm tree.  Its filename was
     mm-fix-vma_is_anonymous-false-positives.patch

This patch was dropped because an alternative patch was merged

------------------------------------------------------
From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Subject: mm: fix vma_is_anonymous() false-positives

vma_is_anonymous() relies on ->vm_ops being NULL to detect anonymous VMA.
This is unreliable as ->mmap may not set ->vm_ops.

False-positive vma_is_anonymous() may lead to crashes:

	next ffff8801ce5e7040 prev ffff8801d20eca50 mm ffff88019c1e13c0
	prot 27 anon_vma ffff88019680cdd8 vm_ops 0000000000000000
	pgoff 0 file ffff8801b2ec2d00 private_data 0000000000000000
	flags: 0xff(read|write|exec|shared|mayread|maywrite|mayexec|mayshare)
	------------[ cut here ]------------
	kernel BUG at mm/memory.c:1422!
	invalid opcode: 0000 [#1] SMP KASAN
	CPU: 0 PID: 18486 Comm: syz-executor3 Not tainted 4.18.0-rc3+ #136
	Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
	RIP: 0010:zap_pmd_range mm/memory.c:1421 [inline]
	RIP: 0010:zap_pud_range mm/memory.c:1466 [inline]
	RIP: 0010:zap_p4d_range mm/memory.c:1487 [inline]
	RIP: 0010:unmap_page_range+0x1c18/0x2220 mm/memory.c:1508
	Code: ff 31 ff 4c 89 e6 42 c6 04 33 f8 e8 92 dd d0 ff 4d 85 e4 0f 85 4a eb ff ff e8 54 dc d0 ff 48 8b bd 10 fc ff ff e8 82 95 fe ff <0f> 0b e8 41 dc d0 ff 0f 0b 4c 89 ad 18 fc ff ff c7 85 7c fb ff ff
	RSP: 0018:ffff8801b0587330 EFLAGS: 00010286
	RAX: 000000000000013c RBX: 1ffff100360b0e9c RCX: ffffc90002620000
	RDX: 0000000000000000 RSI: ffffffff81631851 RDI: 0000000000000001
	RBP: ffff8801b05877c8 R08: ffff880199d40300 R09: ffffed003b5c4fc0
	R10: ffffed003b5c4fc0 R11: ffff8801dae27e07 R12: 0000000000000000
	R13: ffff88019c1e13c0 R14: dffffc0000000000 R15: 0000000020e01000
	FS: 00007fca32251700(0000) GS:ffff8801dae00000(0000) knlGS:0000000000000000
	CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
	CR2: 00007f04c540d000 CR3: 00000001ac1f0000 CR4: 00000000001426f0
	DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
	DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
	Call Trace:
	 unmap_single_vma+0x1a0/0x310 mm/memory.c:1553
	 zap_page_range_single+0x3cc/0x580 mm/memory.c:1644
	 unmap_mapping_range_vma mm/memory.c:2792 [inline]
	 unmap_mapping_range_tree mm/memory.c:2813 [inline]
	 unmap_mapping_pages+0x3a7/0x5b0 mm/memory.c:2845
	 unmap_mapping_range+0x48/0x60 mm/memory.c:2880
	 truncate_pagecache+0x54/0x90 mm/truncate.c:800
	 truncate_setsize+0x70/0xb0 mm/truncate.c:826
	 simple_setattr+0xe9/0x110 fs/libfs.c:409
	 notify_change+0xf13/0x10f0 fs/attr.c:335
	 do_truncate+0x1ac/0x2b0 fs/open.c:63
	 do_sys_ftruncate+0x492/0x560 fs/open.c:205
	 __do_sys_ftruncate fs/open.c:215 [inline]
	 __se_sys_ftruncate fs/open.c:213 [inline]
	 __x64_sys_ftruncate+0x59/0x80 fs/open.c:213
	 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
	 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Reproducer:

	#include <stdio.h>
	#include <stddef.h>
	#include <stdint.h>
	#include <stdlib.h>
	#include <string.h>
	#include <sys/types.h>
	#include <sys/stat.h>
	#include <sys/ioctl.h>
	#include <sys/mman.h>
	#include <unistd.h>
	#include <fcntl.h>

	#define KCOV_INIT_TRACE			_IOR('c', 1, unsigned long)
	#define KCOV_ENABLE			_IO('c', 100)
	#define KCOV_DISABLE			_IO('c', 101)
	#define COVER_SIZE			(1024<<10)

	#define KCOV_TRACE_PC  0
	#define KCOV_TRACE_CMP 1

	int main(int argc, char **argv)
	{
		int fd;
		unsigned long *cover;

		system("mount -t debugfs none /sys/kernel/debug");
		fd = open("/sys/kernel/debug/kcov", O_RDWR);
		ioctl(fd, KCOV_INIT_TRACE, COVER_SIZE);
		cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long),
				PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
		munmap(cover, COVER_SIZE * sizeof(unsigned long));
		cover = mmap(NULL, COVER_SIZE * sizeof(unsigned long),
				PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
		memset(cover, 0, COVER_SIZE * sizeof(unsigned long));
		ftruncate(fd, 3UL << 20);
		return 0;
	}

This can be fixed by assigning anonymous VMAs own vm_ops and not relying
on it being NULL.

If ->mmap() failed to set ->vm_ops, mmap_region() will set it to
dummy_vm_ops.  This way we will have non-NULL ->vm_ops for all VMAs.

[kirill(a)shutemov.name: add comments]
  Link: http://lkml.kernel.org/r/20180711121521.omugjfpuuyxscjjf@kshutemo-mobl1
[kirill.shutemov(a)linux.intel.com: v2]
  Link: http://lkml.kernel.org/r/20180712145626.41665-2-kirill.shutemov@linux.intel…
[kirill.shutemov(a)linux.intel.com: fix splat reported by Marcel]
  Link: http://lkml.kernel.org/r/20180716142049.ioa2irsd2d7sphn4@black.fi.intel.com
Link: http://lkml.kernel.org/r/20180710134821.84709-2-kirill.shutemov@linux.intel…
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Reported-by: syzbot+3f84280d52be9b7083cc(a)syzkaller.appspotmail.com
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Marcel Ziswiler <marcel.ziswiler(a)toradex.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 arch/arm/kernel/process.c  |    1 +
 arch/ia64/kernel/perfmon.c |    1 +
 arch/ia64/mm/init.c        |    2 ++
 drivers/char/mem.c         |    1 +
 fs/exec.c                  |    1 +
 fs/hugetlbfs/inode.c       |    1 +
 include/linux/mm.h         |    5 ++++-
 mm/khugepaged.c            |    4 ++--
 mm/mmap.c                  |   11 +++++++++++
 mm/nommu.c                 |    9 ++++++++-
 mm/shmem.c                 |    1 +
 mm/util.c                  |   12 ++++++++++++
 12 files changed, 45 insertions(+), 4 deletions(-)

diff -puN arch/arm/kernel/process.c~mm-fix-vma_is_anonymous-false-positives arch/arm/kernel/process.c
--- a/arch/arm/kernel/process.c~mm-fix-vma_is_anonymous-false-positives
+++ a/arch/arm/kernel/process.c
@@ -334,6 +334,7 @@ static struct vm_area_struct gate_vma =
 	.vm_start	= 0xffff0000,
 	.vm_end		= 0xffff0000 + PAGE_SIZE,
 	.vm_flags	= VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYEXEC,
+	.vm_ops		= &dummy_vm_ops,
 };
 
 static int __init gate_vma_init(void)
diff -puN arch/ia64/kernel/perfmon.c~mm-fix-vma_is_anonymous-false-positives arch/ia64/kernel/perfmon.c
--- a/arch/ia64/kernel/perfmon.c~mm-fix-vma_is_anonymous-false-positives
+++ a/arch/ia64/kernel/perfmon.c
@@ -2292,6 +2292,7 @@ pfm_smpl_buffer_alloc(struct task_struct
 	vma->vm_file	     = get_file(filp);
 	vma->vm_flags	     = VM_READ|VM_MAYREAD|VM_DONTEXPAND|VM_DONTDUMP;
 	vma->vm_page_prot    = PAGE_READONLY; /* XXX may need to change */
+	vma->vm_ops	     = &dummy_vm_ops;
 
 	/*
 	 * Now we have everything we need and we can initialize
diff -puN arch/ia64/mm/init.c~mm-fix-vma_is_anonymous-false-positives arch/ia64/mm/init.c
--- a/arch/ia64/mm/init.c~mm-fix-vma_is_anonymous-false-positives
+++ a/arch/ia64/mm/init.c
@@ -122,6 +122,7 @@ ia64_init_addr_space (void)
 		vma->vm_end = vma->vm_start + PAGE_SIZE;
 		vma->vm_flags = VM_DATA_DEFAULT_FLAGS|VM_GROWSUP|VM_ACCOUNT;
 		vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
+		vma->vm_ops = &dummy_vm_ops;
 		down_write(&current->mm->mmap_sem);
 		if (insert_vm_struct(current->mm, vma)) {
 			up_write(&current->mm->mmap_sem);
@@ -141,6 +142,7 @@ ia64_init_addr_space (void)
 			vma->vm_page_prot = __pgprot(pgprot_val(PAGE_READONLY) | _PAGE_MA_NAT);
 			vma->vm_flags = VM_READ | VM_MAYREAD | VM_IO |
 					VM_DONTEXPAND | VM_DONTDUMP;
+			vma->vm_ops = &dummy_vm_ops;
 			down_write(&current->mm->mmap_sem);
 			if (insert_vm_struct(current->mm, vma)) {
 				up_write(&current->mm->mmap_sem);
diff -puN drivers/char/mem.c~mm-fix-vma_is_anonymous-false-positives drivers/char/mem.c
--- a/drivers/char/mem.c~mm-fix-vma_is_anonymous-false-positives
+++ a/drivers/char/mem.c
@@ -708,6 +708,7 @@ static int mmap_zero(struct file *file,
 #endif
 	if (vma->vm_flags & VM_SHARED)
 		return shmem_zero_setup(vma);
+	vma->vm_ops = &anon_vm_ops;
 	return 0;
 }
 
diff -puN fs/exec.c~mm-fix-vma_is_anonymous-false-positives fs/exec.c
--- a/fs/exec.c~mm-fix-vma_is_anonymous-false-positives
+++ a/fs/exec.c
@@ -307,6 +307,7 @@ static int __bprm_mm_init(struct linux_b
 	 * configured yet.
 	 */
 	BUILD_BUG_ON(VM_STACK_FLAGS & VM_STACK_INCOMPLETE_SETUP);
+	vma->vm_ops = &anon_vm_ops;
 	vma->vm_end = STACK_TOP_MAX;
 	vma->vm_start = vma->vm_end - PAGE_SIZE;
 	vma->vm_flags = VM_SOFTDIRTY | VM_STACK_FLAGS | VM_STACK_INCOMPLETE_SETUP;
diff -puN fs/hugetlbfs/inode.c~mm-fix-vma_is_anonymous-false-positives fs/hugetlbfs/inode.c
--- a/fs/hugetlbfs/inode.c~mm-fix-vma_is_anonymous-false-positives
+++ a/fs/hugetlbfs/inode.c
@@ -597,6 +597,7 @@ static long hugetlbfs_fallocate(struct f
 	memset(&pseudo_vma, 0, sizeof(struct vm_area_struct));
 	pseudo_vma.vm_flags = (VM_HUGETLB | VM_MAYSHARE | VM_SHARED);
 	pseudo_vma.vm_file = file;
+	pseudo_vma.vm_ops = &dummy_vm_ops;
 
 	for (index = start; index < end; index++) {
 		/*
diff -puN include/linux/mm.h~mm-fix-vma_is_anonymous-false-positives include/linux/mm.h
--- a/include/linux/mm.h~mm-fix-vma_is_anonymous-false-positives
+++ a/include/linux/mm.h
@@ -1536,9 +1536,12 @@ int clear_page_dirty_for_io(struct page
 
 int get_cmdline(struct task_struct *task, char *buffer, int buflen);
 
+extern const struct vm_operations_struct anon_vm_ops;
+extern const struct vm_operations_struct dummy_vm_ops;
+
 static inline bool vma_is_anonymous(struct vm_area_struct *vma)
 {
-	return !vma->vm_ops;
+	return vma->vm_ops == &anon_vm_ops;
 }
 
 #ifdef CONFIG_SHMEM
diff -puN mm/khugepaged.c~mm-fix-vma_is_anonymous-false-positives mm/khugepaged.c
--- a/mm/khugepaged.c~mm-fix-vma_is_anonymous-false-positives
+++ a/mm/khugepaged.c
@@ -440,7 +440,7 @@ int khugepaged_enter_vma_merge(struct vm
 		 * page fault if needed.
 		 */
 		return 0;
-	if (vma->vm_ops || (vm_flags & VM_NO_KHUGEPAGED))
+	if (!vma_is_anonymous(vma) || (vm_flags & VM_NO_KHUGEPAGED))
 		/* khugepaged not yet working on file or special mappings */
 		return 0;
 	hstart = (vma->vm_start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
@@ -831,7 +831,7 @@ static bool hugepage_vma_check(struct vm
 		return IS_ALIGNED((vma->vm_start >> PAGE_SHIFT) - vma->vm_pgoff,
 				HPAGE_PMD_NR);
 	}
-	if (!vma->anon_vma || vma->vm_ops)
+	if (!vma->anon_vma || !vma_is_anonymous(vma))
 		return false;
 	if (is_vma_temporary_stack(vma))
 		return false;
diff -puN mm/mmap.c~mm-fix-vma_is_anonymous-false-positives mm/mmap.c
--- a/mm/mmap.c~mm-fix-vma_is_anonymous-false-positives
+++ a/mm/mmap.c
@@ -561,6 +561,8 @@ static unsigned long count_vma_pages_ran
 void __vma_link_rb(struct mm_struct *mm, struct vm_area_struct *vma,
 		struct rb_node **rb_link, struct rb_node *rb_parent)
 {
+	WARN_ONCE(!vma->vm_ops, "missing vma->vm_ops");
+
 	/* Update tracking information for the gap following the new vma. */
 	if (vma->vm_next)
 		vma_gap_update(vma->vm_next);
@@ -1762,6 +1764,11 @@ unsigned long mmap_region(struct file *f
 		 */
 		vma->vm_file = get_file(file);
 		error = call_mmap(file, vma);
+
+		/* All mappings must have ->vm_ops set */
+		if (!vma->vm_ops)
+			vma->vm_ops = &dummy_vm_ops;
+
 		if (error)
 			goto unmap_and_free_vma;
 
@@ -1780,6 +1787,9 @@ unsigned long mmap_region(struct file *f
 		error = shmem_zero_setup(vma);
 		if (error)
 			goto free_vma;
+	} else {
+		/* vma_is_anonymous() relies on this. */
+		vma->vm_ops = &anon_vm_ops;
 	}
 
 	vma_link(mm, vma, prev, rb_link, rb_parent);
@@ -2992,6 +3002,7 @@ static int do_brk_flags(unsigned long ad
 
 	INIT_LIST_HEAD(&vma->anon_vma_chain);
 	vma->vm_mm = mm;
+	vma->vm_ops = &anon_vm_ops;
 	vma->vm_start = addr;
 	vma->vm_end = addr + len;
 	vma->vm_pgoff = pgoff;
diff -puN mm/nommu.c~mm-fix-vma_is_anonymous-false-positives mm/nommu.c
--- a/mm/nommu.c~mm-fix-vma_is_anonymous-false-positives
+++ a/mm/nommu.c
@@ -1058,6 +1058,8 @@ static int do_mmap_shared_file(struct vm
 	int ret;
 
 	ret = call_mmap(vma->vm_file, vma);
+	if (!vma->vm_ops)
+		vma->vm_ops = &dummy_vm_ops;
 	if (ret == 0) {
 		vma->vm_region->vm_top = vma->vm_region->vm_end;
 		return 0;
@@ -1089,6 +1091,8 @@ static int do_mmap_private(struct vm_are
 	 */
 	if (capabilities & NOMMU_MAP_DIRECT) {
 		ret = call_mmap(vma->vm_file, vma);
+		if (!vma->vm_ops)
+			vma->vm_ops = &dummy_vm_ops;
 		if (ret == 0) {
 			/* shouldn't return success if we're not sharing */
 			BUG_ON(!(vma->vm_flags & VM_MAYSHARE));
@@ -1137,6 +1141,8 @@ static int do_mmap_private(struct vm_are
 		fpos = vma->vm_pgoff;
 		fpos <<= PAGE_SHIFT;
 
+		vma->vm_ops = &dummy_vm_ops;
+
 		ret = kernel_read(vma->vm_file, base, len, &fpos);
 		if (ret < 0)
 			goto error_free;
@@ -1144,7 +1150,8 @@ static int do_mmap_private(struct vm_are
 		/* clear the last little bit */
 		if (ret < len)
 			memset(base + ret, 0, len - ret);
-
+	} else {
+		vma->vm_ops = &anon_vm_ops;
 	}
 
 	return 0;
diff -puN mm/shmem.c~mm-fix-vma_is_anonymous-false-positives mm/shmem.c
--- a/mm/shmem.c~mm-fix-vma_is_anonymous-false-positives
+++ a/mm/shmem.c
@@ -1424,6 +1424,7 @@ static void shmem_pseudo_vma_init(struct
 	/* Bias interleave by inode number to distribute better across nodes */
 	vma->vm_pgoff = index + info->vfs_inode.i_ino;
 	vma->vm_policy = mpol_shared_policy_lookup(&info->policy, index);
+	vma->vm_ops = &dummy_vm_ops;
 }
 
 static void shmem_pseudo_vma_destroy(struct vm_area_struct *vma)
diff -puN mm/util.c~mm-fix-vma_is_anonymous-false-positives mm/util.c
--- a/mm/util.c~mm-fix-vma_is_anonymous-false-positives
+++ a/mm/util.c
@@ -20,6 +20,18 @@
 
 #include "internal.h"
 
+/*
+ * All anonymous VMAs have ->vm_ops set to anon_vm_ops.
+ * vma_is_anonymous() reiles on anon_vm_ops to detect anonymous VMA.
+ */
+const struct vm_operations_struct anon_vm_ops = {};
+
+/*
+ * All VMAs have to have ->vm_ops set. dummy_vm_ops can be used if the VMA
+ * doesn't need to handle any of the operations.
+ */
+const struct vm_operations_struct dummy_vm_ops = {};
+
 static inline int is_kernel_rodata(unsigned long addr)
 {
 	return addr >= (unsigned long)__start_rodata &&
_

Patches currently in -mm which might be from kirill.shutemov(a)linux.intel.com are

mm-page_ext-drop-definition-of-unused-page_ext_debug_poison.patch
mm-page_ext-constify-lookup_page_ext-argument.patch
mm-drop-unneeded-vm_ops-checks-v2.patch
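
The core idea of this (dropped) patch is to stop treating "no ops pointer" as meaning "anonymous" and to compare against a dedicated sentinel object instead. A compact userspace model of the two tests is sketched below. Every name in it is hypothetical, and the ops table is given one member because standard C does not accept the empty struct initializers used in the kernel patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ops table with a single callback slot. */
struct ops_table {
	void (*open)(void);
};

/* Two distinct objects: only their addresses matter in this model. */
static const struct ops_table anon_ops = { NULL };
static const struct ops_table dummy_ops = { NULL };

struct mapping {
	const struct ops_table *ops;
};

/* Old test: any mapping whose setup forgot to install ops looks anonymous. */
static bool is_anonymous_old(const struct mapping *m)
{
	return m->ops == NULL;
}

/* New test: only mappings explicitly tagged with the sentinel qualify. */
static bool is_anonymous_new(const struct mapping *m)
{
	return m->ops == &anon_ops;
}
```

A driver mapping that never set its ops is a false positive under the old test and a negative under the new one, both before and after a fallback assigns it dummy_ops.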

7 years, 5 months


[PATCH] MIPS: Change definition of cpu_relax() for Loongson-3

by Huacai Chen

Linux expects that if a CPU modifies a memory location, then that
modification will eventually become visible to other CPUs in the system.

On a Loongson-3 processor with SFB (Store Fill Buffer), loads may be
prioritised over stores, so it is possible for a store operation to be
postponed if a polling loop immediately follows it. If the variable
being polled indirectly depends on the outstanding store [for example,
another CPU may be polling the variable that is pending modification]
then there is the potential for deadlock if interrupts are disabled.
This deadlock occurs in qspinlock code.

This patch changes the definition of cpu_relax() to smp_mb() for
Loongson-3, forcing a flush of the SFB on SMP systems before the
next load takes place. If the kernel is not compiled for SMP support,
this will expand to a barrier() as before.

References: 534be1d5a2da940 (ARM: 6194/1: change definition of cpu_relax() for ARM11MPCore)
Cc: stable(a)vger.kernel.org
Signed-off-by: Huacai Chen <chenhc(a)lemote.com>
---
 arch/mips/include/asm/processor.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
index af34afb..a8c4a3a 100644
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -386,7 +386,17 @@ unsigned long get_wchan(struct task_struct *p);
 #define KSTK_ESP(tsk) (task_pt_regs(tsk)->regs[29])
 #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
 
+#ifdef CONFIG_CPU_LOONGSON3
+/*
+ * Loongson-3's SFB (Store-Fill-Buffer) may get starved when stuck in a read
+ * loop. Since spin loops of any kind should have a cpu_relax() in them, force
+ * a Store-Fill-Buffer flush from cpu_relax() such that any pending writes will
+ * become available as expected.
+ */
+#define cpu_relax()	smp_mb()
+#else
 #define cpu_relax()	barrier()
+#endif
 
 /*
  * Return_address is a replacement for __builtin_return_address(count)
-- 
2.7.0
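
As a rough userspace analogy for the change, a polling loop can issue a full fence in its relax step rather than a pure compiler barrier. The sketch below uses C11 atomics. relax_with_fence and spin_until_set are hypothetical names, and a C11 fence is only an analogy for the kernel's smp_mb(); whether it has any effect on a Loongson-3 store buffer is not something this sketch claims.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Analogy for the patched cpu_relax(): a full fence on every loop iteration. */
static inline void relax_with_fence(void)
{
	atomic_thread_fence(memory_order_seq_cst);
}

/*
 * Poll a flag until another thread sets it. Returns how many polls saw the
 * flag still clear.
 */
static unsigned long spin_until_set(atomic_bool *flag)
{
	unsigned long polls = 0;

	while (!atomic_load_explicit(flag, memory_order_relaxed)) {
		relax_with_fence();
		polls++;
	}
	return polls;
}
```

The trade-off is the usual one: every iteration pays for the fence, in exchange for not depending on the hardware to drain earlier stores while the loop keeps issuing loads.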

7 years, 5 months


FAILED: patch "[PATCH] drm/nouveau/drm/nouveau: Fix runtime PM leak in" failed to apply to 4.17-stable tree

by gregkh＠linuxfoundation.org


The patch below does not apply to the 4.17-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From e5d54f1935722f83df7619f3978f774c2b802cd8 Mon Sep 17 00:00:00 2001
From: Lyude Paul <lyude(a)redhat.com>
Date: Thu, 12 Jul 2018 13:02:53 -0400
Subject: [PATCH] drm/nouveau/drm/nouveau: Fix runtime PM leak in
 nv50_disp_atomic_commit()

A CRTC being enabled doesn't mean it's on! It doesn't even necessarily
mean it's being used. This fixes runtime PM leaks on the P50 I've got
next to me.

Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs(a)redhat.com>

diff --git a/drivers/gpu/drm/nouveau/dispnv50/disp.c b/drivers/gpu/drm/nouveau/dispnv50/disp.c
index 9382e99a0bc7..31b12b4f321a 100644
--- a/drivers/gpu/drm/nouveau/dispnv50/disp.c
+++ b/drivers/gpu/drm/nouveau/dispnv50/disp.c
@@ -1878,7 +1878,7 @@ nv50_disp_atomic_commit(struct drm_device *dev,
 	nv50_disp_atomic_commit_tail(state);
 
 	drm_for_each_crtc(crtc, dev) {
-		if (crtc->state->enable) {
+		if (crtc->state->active) {
 			if (!drm->have_disp_power_ref) {
 				drm->have_disp_power_ref = true;
 				return 0;
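
The distinction the changelog draws, between a CRTC that is enabled and one that is actually on, can be modelled with two independent flags. In this sketch the struct and function names are hypothetical and the power reference is reduced to a yes/no answer:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two flags on a CRTC state. */
struct model_crtc_state {
	bool enable;	/* configured, but possibly switched off */
	bool active;	/* actually driving a display */
};

/* Decide whether a display power reference is needed at all. */
static bool any_crtc_active(const struct model_crtc_state *states, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (states[i].active)
			return true;
	}
	return false;
}
```

Keying the decision on enable instead would hold the reference for a CRTC that is configured but switched off, which is the kind of runtime PM leak the subject line describes.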


[stable-4.14 00/23] block/scsi multiqueue performance enhancement and

by Jack Wang

Hi Greg,

Please consider this patchset, which includes block/scsi multiqueue
performance enhancements and bugfixes.

We've run multiple benchmarks and different tests for over one week, and
the results look good. These patches are also included in Oracle UEK5.

Almost all of them are simple cherry-picks; only 2 patches need minor
adjustments. They apply cleanly on 4.14.57.

Jens Axboe (3):
  Revert "blk-mq: don't handle TAG_SHARED in restart"
  blk-mq: fix issue with shared tag queue re-running
  blk-mq: only run the hardware queue if IO is pending

Jianchao Wang (1):
  blk-mq: put the driver tag of nxt rq before first one is requeued

Ming Lei (19):
  blk-mq-sched: move actual dispatching into one helper
  blk-mq: introduce .get_budget and .put_budget in blk_mq_ops
  sbitmap: introduce __sbitmap_for_each_set()
  blk-mq-sched: improve dispatching from sw queue
  scsi: allow passing in null rq to scsi_prep_state_check()
  scsi: implement .get_budget and .put_budget for blk-mq
  SCSI: don't get target/host busy_count in scsi_mq_get_budget()
  blk-mq: don't handle TAG_SHARED in restart
  blk-mq: don't restart queue when .get_budget returns BLK_STS_RESOURCE
  blk-mq: don't handle failure in .get_budget
  blk-flush: don't run queue for requests bypassing flush
  block: pass 'run_queue' to blk_mq_request_bypass_insert
  blk-flush: use blk_mq_request_bypass_insert()
  blk-mq-sched: decide how to handle flush rq via RQF_FLUSH_SEQ
  blk-mq: move blk_mq_put_driver_tag*() into blk-mq.h
  blk-mq: don't allocate driver tag upfront for flush rq
  blk-mq: put driver tag if dispatch budget can't be got
  blk-mq: quiesce queue during switching io sched and updating nr_requests
  scsi: core: run queue if SCSI device queue isn't ready and queue is idle

 block/blk-core.c        |   2 +-
 block/blk-flush.c       |  37 +++++--
 block/blk-mq-debugfs.c  |   1 -
 block/blk-mq-sched.c    | 203 ++++++++++++++++++++++-------------
 block/blk-mq.c          | 278 +++++++++++++++++++++++++++---------------------
 block/blk-mq.h          |  58 +++++++++-
 block/elevator.c        |   2 +
 drivers/scsi/scsi_lib.c |  53 ++++++---
 include/linux/blk-mq.h  |  20 +++-
 include/linux/sbitmap.h |  64 ++++++++---
 10 files changed, 475 insertions(+), 243 deletions(-)

-- 
2.7.4


[PATCH] cxl: Fix wrong comparison in cxl_adapter_context_get()

by Vaibhav Jain

Function atomic_inc_unless_negative() returns a bool to indicate
success/failure. However cxl_adapter_context_get() wrongly compares the
return value against '>=0' which will always be true. The patch fixes
this comparison to '==0' thereby also fixing this compile time warning:

	drivers/misc/cxl/main.c:290 cxl_adapter_context_get()
	warn: 'atomic_inc_unless_negative(&adapter->contexts_num)' is unsigned

Cc: stable(a)vger.kernel.org
Fixes: 70b565bbdb91 ("cxl: Prevent adapter reset if an active context exists")
Reported-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Signed-off-by: Vaibhav Jain <vaibhav(a)linux.ibm.com>
---
 drivers/misc/cxl/main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/misc/cxl/main.c b/drivers/misc/cxl/main.c
index c1ba0d42cbc8..e0f29b8a872d 100644
--- a/drivers/misc/cxl/main.c
+++ b/drivers/misc/cxl/main.c
@@ -287,7 +287,7 @@ int cxl_adapter_context_get(struct cxl *adapter)
 	int rc;
 
 	rc = atomic_inc_unless_negative(&adapter->contexts_num);
-	return rc >= 0 ? 0 : -EBUSY;
+	return rc ? 0 : -EBUSY;
 }
 
 void cxl_adapter_context_put(struct cxl *adapter)
-- 
2.17.1
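
To make the comparison bug concrete, the sketch below emulates it in userspace with C11 atomics. It is an illustration under assumptions, not the driver code: inc_unless_negative() is a hypothetical stand-in for the kernel's atomic_inc_unless_negative(), and context_get_buggy(), context_get_fixed() and check_context_get() are names invented here to contrast the old and new return expressions from the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for atomic_inc_unless_negative():
 * increment *v and return true, unless *v is negative, in which case
 * leave it alone and return false.
 */
static bool inc_unless_negative(atomic_int *v)
{
	int old = atomic_load(v);

	do {
		if (old < 0)
			return false;
	} while (!atomic_compare_exchange_weak(v, &old, old + 1));

	return true;
}

/* Old check: a bool stored in an int is 0 or 1, so rc >= 0 always holds. */
static int context_get_buggy(atomic_int *contexts_num)
{
	int rc = inc_unless_negative(contexts_num);

	return rc >= 0 ? 0 : -EBUSY;
}

/* New check, matching the '+' line of the diff: test rc for truth. */
static int context_get_fixed(atomic_int *contexts_num)
{
	int rc = inc_unless_negative(contexts_num);

	return rc ? 0 : -EBUSY;
}

/* Usage example: a negative counter models an adapter that refuses contexts. */
static void check_context_get(void)
{
	atomic_int blocked = -1;
	atomic_int open = 0;

	assert(context_get_buggy(&blocked) == 0);      /* failure is masked */
	assert(context_get_fixed(&blocked) == -EBUSY); /* failure is reported */
	assert(context_get_fixed(&open) == 0);
	assert(atomic_load(&open) == 1);
}
```

With the old expression a caller is told it holds a context reference even though the counter was never incremented, which is the failure mode the patch removes.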

