The patch titled
     Subject: mm: do not bug_on on incorrect length in __mm_populate()
has been added to the -mm tree.  Its filename is
     mm-do-not-bug_on-on-incorrect-lenght-in-__mm_populate.patch
This patch should soon appear at
    http://ozlabs.org/~akpm/mmots/broken-out/mm-do-not-bug_on-on-incorrect-lengh...
and later at
    http://ozlabs.org/~akpm/mmotm/broken-out/mm-do-not-bug_on-on-incorrect-lengh...
Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated there every 3-4 working days
------------------------------------------------------
From: Michal Hocko mhocko@suse.com
Subject: mm: do not bug_on on incorrect length in __mm_populate()
syzbot has noticed that a specially crafted library can easily hit the VM_BUG_ON in __mm_populate:
localhost login: [ 81.210241] emacs (9634) used greatest stack depth: 10416 bytes left
[ 140.099935] ------------[ cut here ]------------
[ 140.101904] kernel BUG at mm/gup.c:1242!
[ 140.103572] invalid opcode: 0000 [#1] SMP
[ 140.105220] CPU: 2 PID: 9667 Comm: a.out Not tainted 4.18.0-rc3 #644
[ 140.107762] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/19/2017
[ 140.112000] RIP: 0010:__mm_populate+0x1e2/0x1f0
[ 140.113875] Code: 55 d0 65 48 33 14 25 28 00 00 00 89 d8 75 21 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d c3 e8 75 18 f1 ff 0f 0b e8 6e 18 f1 ff <0f> 0b 31 db eb c9 e8 93 06 e0 ff 0f 1f 00 55 48 89 e5 53 48 89 fb
[ 140.121403] RSP: 0018:ffffc90000dffd78 EFLAGS: 00010293
[ 140.123516] RAX: ffff8801366c63c0 RBX: 000000007bf81000 RCX: ffffffff813e4ee2
[ 140.126352] RDX: 0000000000000000 RSI: 0000000000007676 RDI: 000000007bf81000
[ 140.129236] RBP: ffffc90000dffdc0 R08: 0000000000000000 R09: 0000000000000000
[ 140.132110] R10: ffff880135895c80 R11: 0000000000000000 R12: 0000000000007676
[ 140.134955] R13: 0000000000008000 R14: 0000000000000000 R15: 0000000000007676
[ 140.137785] FS: 0000000000000000(0000) GS:ffff88013a680000(0063) knlGS:00000000f7db9700
[ 140.140998] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[ 140.143303] CR2: 00000000f7ea56e0 CR3: 0000000134674004 CR4: 00000000000606e0
[ 140.145906] Call Trace:
[ 140.146728] vm_brk_flags+0xc3/0x100
[ 140.147830] vm_brk+0x1f/0x30
[ 140.148714] load_elf_library+0x281/0x2e0
[ 140.149875] __ia32_sys_uselib+0x170/0x1e0
[ 140.151028] ? copy_overflow+0x30/0x30
[ 140.152105] ? __ia32_sys_uselib+0x170/0x1e0
[ 140.153301] do_fast_syscall_32+0xca/0x420
[ 140.154455] entry_SYSENTER_compat+0x70/0x7f
The reason is that the length of the new brk is not page aligned when we try to populate it. There is no reason to bug on that though. do_brk_flags already aligns the length properly so the mapping is expanded as it should. All we need is to tell mm_populate about it. Besides that there is absolutely no reason to bug_on in the first place. The worst thing that could happen is that the last page wouldn't get populated and that is far from putting the system into an inconsistent state.
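The arithmetic behind the failing check can be reproduced in user space. The sketch below is an illustration only, not kernel code: it assumes 4 KiB pages, re-declares simplified stand-ins for PAGE_SIZE, PAGE_MASK and PAGE_ALIGN, and uses a hypothetical helper name, len_is_page_aligned(). The value 0x7676 is taken from RSI/R12/R15 in the trace above on the assumption that it is the unaligned length; 0x8000 in R13 would then be its page-aligned form.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified user-space stand-ins for the kernel macros; 4 KiB pages assumed. */
#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PAGE_MASK	(~(PAGE_SIZE - 1))
#define PAGE_ALIGN(x)	(((x) + PAGE_SIZE - 1) & PAGE_MASK)

/*
 * Hypothetical helper: the condition that the removed
 * VM_BUG_ON(len != PAGE_ALIGN(len)) required to hold.
 */
static bool len_is_page_aligned(unsigned long len)
{
	return len == PAGE_ALIGN(len);
}
```

With these definitions, len_is_page_aligned(0x7676UL) is false, so the old assertion would fire, while PAGE_ALIGN(0x7676UL) rounds up to 0x8000 and passes.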
Fix the issue by moving the length sanitization code from do_brk_flags up to vm_brk_flags. The only other caller of do_brk_flags is the brk syscall entry, and it makes sure to provide the proper length, so there is no need for sanitization there and we can use do_brk_flags without it.
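As a rough user-space sketch of the sanitization that moves into vm_brk_flags, the three outcomes (aligned length, empty request, overflow) can be modelled as below. The helper name sanitize_brk_len() and its return convention are hypothetical and exist only for this illustration; the page size is assumed to be 4 KiB and PAGE_ALIGN is a simplified stand-in for the kernel macro.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Simplified stand-ins for the kernel macros; 4 KiB pages assumed. */
#define PAGE_SIZE	4096UL
#define PAGE_ALIGN(x)	(((x) + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1))

/*
 * Hypothetical helper modelling the checks at the top of vm_brk_flags().
 * Returns 1 with *len set when there is something to map, 0 for an
 * empty request, and -ENOMEM when rounding up wraps around.
 */
static int sanitize_brk_len(unsigned long request, unsigned long *len)
{
	*len = PAGE_ALIGN(request);
	if (*len < request)	/* rounding up overflowed unsigned long */
		return -ENOMEM;
	if (!*len)		/* zero-length request: nothing to do */
		return 0;
	return 1;
}
```

Doing this once in the outer function means the same aligned length is available to both the mapping step and the later populate step, which is the point of the change described above.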
Also remove the bogus BUG_ONs.
[osalvador@techadventures.net: fix up vm_brk_flags s@request@len@]
Link: http://lkml.kernel.org/r/20180706090217.GI32658@dhcp22.suse.cz
Signed-off-by: Michal Hocko mhocko@suse.com
Reported-by: syzbot syzbot+5dcb560fe12aa5091c06@syzkaller.appspotmail.com
Tested-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
Cc: Oscar Salvador osalvador@techadventures.net
Cc: Zi Yan zi.yan@cs.rutgers.edu
Cc: "Aneesh Kumar K.V" aneesh.kumar@linux.vnet.ibm.com
Cc: Dan Williams dan.j.williams@intel.com
Cc: "Kirill A. Shutemov" kirill.shutemov@linux.intel.com
Cc: Michael S. Tsirkin mst@redhat.com
Cc: Al Viro viro@zeniv.linux.org.uk
Cc: "Huang, Ying" ying.huang@intel.com
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton akpm@linux-foundation.org
---
 mm/gup.c  |    2 --
 mm/mmap.c |   29 ++++++++++++-----------------
 2 files changed, 12 insertions(+), 19 deletions(-)
diff -puN mm/gup.c~mm-do-not-bug_on-on-incorrect-lenght-in-__mm_populate mm/gup.c
--- a/mm/gup.c~mm-do-not-bug_on-on-incorrect-lenght-in-__mm_populate
+++ a/mm/gup.c
@@ -1238,8 +1238,6 @@ int __mm_populate(unsigned long start, u
 	int locked = 0;
 	long ret = 0;
 
-	VM_BUG_ON(start & ~PAGE_MASK);
-	VM_BUG_ON(len != PAGE_ALIGN(len));
 	end = start + len;
 
 	for (nstart = start; nstart < end; nstart = nend) {
diff -puN mm/mmap.c~mm-do-not-bug_on-on-incorrect-lenght-in-__mm_populate mm/mmap.c
--- a/mm/mmap.c~mm-do-not-bug_on-on-incorrect-lenght-in-__mm_populate
+++ a/mm/mmap.c
@@ -186,8 +186,8 @@ static struct vm_area_struct *remove_vma
 	return next;
 }
 
-static int do_brk(unsigned long addr, unsigned long len, struct list_head *uf);
-
+static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long flags,
+		struct list_head *uf);
 SYSCALL_DEFINE1(brk, unsigned long, brk)
 {
 	unsigned long retval;
@@ -245,7 +245,7 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
 		goto out;
 
 	/* Ok, looks good - let it rip. */
-	if (do_brk(oldbrk, newbrk-oldbrk, &uf) < 0)
+	if (do_brk_flags(oldbrk, newbrk-oldbrk, 0, &uf) < 0)
 		goto out;
 
 set_brk:
@@ -2929,21 +2929,14 @@ static inline void verify_mm_writelocked
  * anonymous maps. eventually we may be able to do some
  * brk-specific accounting here.
  */
-static int do_brk_flags(unsigned long addr, unsigned long request, unsigned long flags, struct list_head *uf)
+static int do_brk_flags(unsigned long addr, unsigned long len, unsigned long flags, struct list_head *uf)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma, *prev;
-	unsigned long len;
 	struct rb_node **rb_link, *rb_parent;
 	pgoff_t pgoff = addr >> PAGE_SHIFT;
 	int error;
 
-	len = PAGE_ALIGN(request);
-	if (len < request)
-		return -ENOMEM;
-	if (!len)
-		return 0;
-
 	/* Until we need other flags, refuse anything except VM_EXEC. */
 	if ((flags & (~VM_EXEC)) != 0)
 		return -EINVAL;
@@ -3015,18 +3008,20 @@ out:
 	return 0;
 }
 
-static int do_brk(unsigned long addr, unsigned long len, struct list_head *uf)
-{
-	return do_brk_flags(addr, len, 0, uf);
-}
-
-int vm_brk_flags(unsigned long addr, unsigned long len, unsigned long flags)
+int vm_brk_flags(unsigned long addr, unsigned long request, unsigned long flags)
 {
 	struct mm_struct *mm = current->mm;
+	unsigned long len;
 	int ret;
 	bool populate;
 	LIST_HEAD(uf);
 
+	len = PAGE_ALIGN(request);
+	if (len < request)
+		return -ENOMEM;
+	if (!len)
+		return 0;
+
 	if (down_write_killable(&mm->mmap_sem))
 		return -EINTR;

_
Patches currently in -mm which might be from mhocko@suse.com are
memblock-do-not-complain-about-top-down-allocations-for-memory_hotremove.patch
mm-do-not-bug_on-on-incorrect-lenght-in-__mm_populate.patch
mm-drop-vm_bug_on-from-__get_free_pages.patch
memcg-oom-move-out_of_memory-back-to-the-charge-path.patch
mm-oom-docs-describe-the-cgroup-aware-oom-killer-fix-2.patch