December 2018 - Linux-stable-mirror

[merged] mm-khugepaged-collapse_shmem-stop-if-punched-or-truncated.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: mm/khugepaged: collapse_shmem() stop if punched or truncated
has been removed from the -mm tree.  Its filename was
     mm-khugepaged-collapse_shmem-stop-if-punched-or-truncated.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm/khugepaged: collapse_shmem() stop if punched or truncated

Huge tmpfs testing showed that although collapse_shmem() recognizes a
concurrently truncated or hole-punched page correctly, its handling of
holes was liable to refill an emptied extent.  Add check to stop that.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261522040.2275@eggly.anvils
Fixes: f3f0e1d2150b2 ("khugepaged: add support of collapse for tmpfs/shmem pages")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Reviewed-by: Matthew Wilcox <willy(a)infradead.org>
Cc: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Jerome Glisse <jglisse(a)redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov(a)yandex-team.ru>
Cc: <stable(a)vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/khugepaged.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

--- a/mm/khugepaged.c~mm-khugepaged-collapse_shmem-stop-if-punched-or-truncated
+++ a/mm/khugepaged.c
@@ -1359,6 +1359,17 @@ static void collapse_shmem(struct mm_str
 		VM_BUG_ON(index != xas.xa_index);
 		if (!page) {
+			/*
+			 * Stop if extent has been truncated or hole-punched,
+			 * and is now completely empty.
+			 */
+			if (index == start) {
+				if (!xas_next_entry(&xas, end - 1)) {
+					result = SCAN_TRUNCATED;
+					break;
+				}
+				xas_set(&xas, index);
+			}
 			if (!shmem_charge(mapping->host, 1)) {
 				result = SCAN_FAIL;
 				break;
_

Patches currently in -mm which might be from hughd(a)google.com are

mm-put_and_wait_on_page_locked-while-page-is-migrated.patch
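
The check added above can be sketched outside the kernel. This is only a
userspace illustration, assuming the extent is modelled as a plain array of
slot pointers; check_extent() and the array model are hypothetical stand-ins
for the XArray walk, not kernel API. If the first slot is a hole and no later
slot in the extent is populated, the scan stops with SCAN_TRUNCATED instead of
refilling the emptied extent.

```c
#include <assert.h>
#include <stddef.h>

/* Result names borrowed from the patch; the values here are illustrative. */
enum scan_result { SCAN_SUCCEED, SCAN_TRUNCATED };

/*
 * Hypothetical model: slots[start..end-1] stands in for the page cache
 * entries of one huge-page-sized extent; NULL means "hole".
 */
static enum scan_result check_extent(void *const slots[], size_t start,
				     size_t end)
{
	size_t i;

	assert(start < end);

	/* A populated first slot means the extent is not empty. */
	if (slots[start])
		return SCAN_SUCCEED;

	/* First slot is a hole: look for any later entry in the extent. */
	for (i = start + 1; i < end; i++)
		if (slots[i])
			return SCAN_SUCCEED;

	/* Completely empty: truncated or hole-punched, so stop. */
	return SCAN_TRUNCATED;
}
```

In the patch the lookahead is done once, when index == start, and
xas_set() then rewinds the iterator so the normal loop continues.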


[merged] mm-huge_memory-fix-lockdep-complaint-on-32-bit-i_size_read.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: mm/huge_memory: fix lockdep complaint on 32-bit i_size_read()
has been removed from the -mm tree.  Its filename was
     mm-huge_memory-fix-lockdep-complaint-on-32-bit-i_size_read.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm/huge_memory: fix lockdep complaint on 32-bit i_size_read()

Huge tmpfs testing, on 32-bit kernel with lockdep enabled, showed that
__split_huge_page() was using i_size_read() while holding the irq-safe
lru_lock and page tree lock, but the 32-bit i_size_read() uses an
irq-unsafe seqlock which should not be nested inside them.

Instead, read the i_size earlier in split_huge_page_to_list(), and pass
the end offset down to __split_huge_page(): all while holding head page
lock, which is enough to prevent truncation of that extent before the
page tree lock has been taken.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261520070.2275@eggly.anvils
Fixes: baa355fd33142 ("thp: file pages support for split_huge_page()")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Jerome Glisse <jglisse(a)redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov(a)yandex-team.ru>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/huge_memory.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)

--- a/mm/huge_memory.c~mm-huge_memory-fix-lockdep-complaint-on-32-bit-i_size_read
+++ a/mm/huge_memory.c
@@ -2439,12 +2439,11 @@ static void __split_huge_page_tail(struc
 }
 static void __split_huge_page(struct page *page, struct list_head *list,
-		unsigned long flags)
+		pgoff_t end, unsigned long flags)
 {
 	struct page *head = compound_head(page);
 	struct zone *zone = page_zone(head);
 	struct lruvec *lruvec;
-	pgoff_t end = -1;
 	int i;
 	lruvec = mem_cgroup_page_lruvec(head, zone->zone_pgdat);
@@ -2452,9 +2451,6 @@ static void __split_huge_page(struct pag
 	/* complete memcg works before add pages to LRU */
 	mem_cgroup_split_huge_fixup(head);
-	if (!PageAnon(page))
-		end = DIV_ROUND_UP(i_size_read(head->mapping->host), PAGE_SIZE);
-
 	for (i = HPAGE_PMD_NR - 1; i >= 1; i--) {
 		__split_huge_page_tail(head, i, lruvec, list);
 		/* Some pages can be beyond i_size: drop them from page cache */
@@ -2626,6 +2622,7 @@ int split_huge_page_to_list(struct page
 	int count, mapcount, extra_pins, ret;
 	bool mlocked;
 	unsigned long flags;
+	pgoff_t end;
 	VM_BUG_ON_PAGE(is_huge_zero_page(page), page);
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
@@ -2648,6 +2645,7 @@ int split_huge_page_to_list(struct page
 			ret = -EBUSY;
 			goto out;
 		}
+		end = -1;
 		mapping = NULL;
 		anon_vma_lock_write(anon_vma);
 	} else {
@@ -2661,6 +2659,15 @@ int split_huge_page_to_list(struct page
 		anon_vma = NULL;
 		i_mmap_lock_read(mapping);
+
+		/*
+		 *__split_huge_page() may need to trim off pages beyond EOF:
+		 * but on 32-bit, i_size_read() takes an irq-unsafe seqlock,
+		 * which cannot be nested inside the page tree lock. So note
+		 * end now: i_size itself may be changed at any moment, but
+		 * head page lock is good enough to serialize the trimming.
+		 */
+		end = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE);
 	}
 	/*
@@ -2707,7 +2714,7 @@ int split_huge_page_to_list(struct page
 		if (mapping)
 			__dec_node_page_state(page, NR_SHMEM_THPS);
 		spin_unlock(&pgdata->split_queue_lock);
-		__split_huge_page(page, list, flags);
+		__split_huge_page(page, list, end, flags);
 		if (PageSwapCache(head)) {
 			swp_entry_t entry = { .val = page_private(head) };
_

Patches currently in -mm which might be from hughd(a)google.com are

mm-put_and_wait_on_page_locked-while-page-is-migrated.patch
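
To make the "end" value passed down by this patch concrete, here is a small
userspace sketch of the arithmetic. The page size, the 512-subpage huge page
and the helper names end_index() and tails_beyond_eof() are assumptions for
illustration only; the kernel computes end with
DIV_ROUND_UP(i_size_read(...), PAGE_SIZE) and uses -1 for anonymous pages.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE	4096ULL	/* assumed page size for this sketch */
#define HPAGE_PMD_NR	512	/* assumed: 2 MiB huge page of 4 KiB pages */
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* First page index lying wholly at or beyond EOF for a file of i_size bytes. */
static uint64_t end_index(uint64_t i_size)
{
	return DIV_ROUND_UP(i_size, PAGE_SIZE);
}

/*
 * Hypothetical helper: count the tail pages of a huge page starting at
 * head_index whose index is >= end, i.e. the ones the split would drop
 * from the page cache.
 */
static unsigned int tails_beyond_eof(uint64_t head_index, uint64_t end)
{
	unsigned int i, dropped = 0;

	/* Huge pages in the page cache start at an aligned index. */
	assert(head_index % HPAGE_PMD_NR == 0);

	for (i = 1; i < HPAGE_PMD_NR; i++)
		if (head_index + i >= end)
			dropped++;
	return dropped;
}
```

With end set to the all-ones value (the anonymous case in the patch), no
tail page compares as beyond EOF, so nothing is dropped.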


[merged] mm-huge_memory-splitting-set-mappingindex-before-unfreeze.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: mm/huge_memory: splitting set mapping+index before unfreeze
has been removed from the -mm tree.  Its filename was
     mm-huge_memory-splitting-set-mappingindex-before-unfreeze.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm/huge_memory: splitting set mapping+index before unfreeze

Huge tmpfs stress testing has occasionally hit shmem_undo_range()'s
VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page).

Move the setting of mapping and index up before the page_ref_unfreeze()
in __split_huge_page_tail() to fix this: so that a page cache lookup
cannot get a reference while the tail's mapping and index are unstable.

In fact, might as well move them up before the smp_wmb(): I don't see an
actual need for that, but if I'm missing something, this way round is
safer than the other, and no less efficient.

You might argue that VM_BUG_ON_PAGE(page_to_pgoff(page) != index, page)
is misplaced, and should be left until after the trylock_page(); but
left as is has not crashed since, and gives more stringent assurance.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261516380.2275@eggly.anvils
Fixes: e9b61f19858a5 ("thp: reintroduce split_huge_page()")
Requires: 605ca5ede764 ("mm/huge_memory.c: reorder operations in __split_huge_page_tail()")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Konstantin Khlebnikov <khlebnikov(a)yandex-team.ru>
Cc: Jerome Glisse <jglisse(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/huge_memory.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

--- a/mm/huge_memory.c~mm-huge_memory-splitting-set-mappingindex-before-unfreeze
+++ a/mm/huge_memory.c
@@ -2402,6 +2402,12 @@ static void __split_huge_page_tail(struc
 			 (1L << PG_unevictable) |
 			 (1L << PG_dirty)));
+	/* ->mapping in first tail page is compound_mapcount */
+	VM_BUG_ON_PAGE(tail > 2 && page_tail->mapping != TAIL_MAPPING,
+			page_tail);
+	page_tail->mapping = head->mapping;
+	page_tail->index = head->index + tail;
+
 	/* Page flags must be visible before we make the page non-compound. */
 	smp_wmb();
@@ -2422,12 +2428,6 @@ static void __split_huge_page_tail(struc
 	if (page_is_idle(head))
 		set_page_idle(page_tail);
-	/* ->mapping in first tail page is compound_mapcount */
-	VM_BUG_ON_PAGE(tail > 2 && page_tail->mapping != TAIL_MAPPING,
-			page_tail);
-	page_tail->mapping = head->mapping;
-
-	page_tail->index = head->index + tail;
 	page_cpupid_xchg_last(page_tail, page_cpupid_last(head));
 	/*
_

Patches currently in -mm which might be from hughd(a)google.com are

mm-put_and_wait_on_page_locked-while-page-is-migrated.patch
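
The ordering rule behind this patch (write mapping and index first, only
then make the refcount non-zero) can be illustrated with C11 atomics in
userspace. This is a rough analogy, not the kernel's implementation:
struct fake_page, publish_tail() and try_get() are hypothetical names, and
release/acquire ordering stands in for the kernel's smp_wmb() plus
page_ref_unfreeze().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a tail page during a split. */
struct fake_page {
	void *mapping;
	unsigned long index;
	atomic_int refcount;	/* 0 means frozen: lookups must back off */
};

/* Writer: set the fields first, then unfreeze with release ordering. */
static void publish_tail(struct fake_page *tail, void *mapping,
			 unsigned long index)
{
	/* The tail must still be frozen while its fields are unstable. */
	assert(atomic_load(&tail->refcount) == 0);

	tail->mapping = mapping;
	tail->index = index;
	atomic_store_explicit(&tail->refcount, 1, memory_order_release);
}

/*
 * Reader: take a reference only if the page is not frozen. Returns 1 and
 * copies out mapping/index on success, 0 if the page is still frozen.
 */
static int try_get(struct fake_page *page, void **mapping,
		   unsigned long *index)
{
	int ref = atomic_load_explicit(&page->refcount, memory_order_acquire);

	while (ref != 0) {
		if (atomic_compare_exchange_weak_explicit(&page->refcount,
				&ref, ref + 1,
				memory_order_acquire, memory_order_relaxed)) {
			*mapping = page->mapping;
			*index = page->index;
			return 1;
		}
	}
	return 0;
}
```

If the fields were written after the unfreeze, a reader that wins the
reference in between could observe a stale index, which is the kind of
mismatch the VM_BUG_ON_PAGE() in shmem_undo_range() reported.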


[merged] mm-huge_memory-rename-freeze_page-to-unmap_page.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: mm/huge_memory: rename freeze_page() to unmap_page()
has been removed from the -mm tree.  Its filename was
     mm-huge_memory-rename-freeze_page-to-unmap_page.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm/huge_memory: rename freeze_page() to unmap_page()

The term "freeze" is used in several ways in the kernel, and in mm it
has the particular meaning of forcing page refcount temporarily to 0.
freeze_page() is just too confusing a name for a function that unmaps a
page: rename it unmap_page(), and rename unfreeze_page() remap_page().

Went to change the mention of freeze_page() added later in mm/rmap.c,
but found it to be incorrect: ordinary page reclaim reaches there too;
but the substance of the comment still seems correct, so edit it down.

Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1811261514080.2275@eggly.anvils
Fixes: e9b61f19858a5 ("thp: reintroduce split_huge_page()")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: Jerome Glisse <jglisse(a)redhat.com>
Cc: Konstantin Khlebnikov <khlebnikov(a)yandex-team.ru>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/huge_memory.c | 12 ++++++------
 mm/rmap.c        | 13 +++----------
 2 files changed, 9 insertions(+), 16 deletions(-)

--- a/mm/huge_memory.c~mm-huge_memory-rename-freeze_page-to-unmap_page
+++ a/mm/huge_memory.c
@@ -2350,7 +2350,7 @@ void vma_adjust_trans_huge(struct vm_are
 	}
 }
-static void freeze_page(struct page *page)
+static void unmap_page(struct page *page)
 {
 	enum ttu_flags ttu_flags = TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS |
 		TTU_RMAP_LOCKED | TTU_SPLIT_HUGE_PMD;
@@ -2365,7 +2365,7 @@ static void freeze_page(struct page *pag
 	VM_BUG_ON_PAGE(!unmap_success, page);
 }
-static void unfreeze_page(struct page *page)
+static void remap_page(struct page *page)
 {
 	int i;
 	if (PageTransHuge(page)) {
@@ -2483,7 +2483,7 @@ static void __split_huge_page(struct pag
 	spin_unlock_irqrestore(zone_lru_lock(page_zone(head)), flags);
-	unfreeze_page(head);
+	remap_page(head);
 	for (i = 0; i < HPAGE_PMD_NR; i++) {
 		struct page *subpage = head + i;
@@ -2664,7 +2664,7 @@ int split_huge_page_to_list(struct page
 	}
 	/*
-	 * Racy check if we can split the page, before freeze_page() will
+	 * Racy check if we can split the page, before unmap_page() will
 	 * split PMDs
 	 */
 	if (!can_split_huge_page(head, &extra_pins)) {
@@ -2673,7 +2673,7 @@ int split_huge_page_to_list(struct page
 	}
 	mlocked = PageMlocked(page);
-	freeze_page(head);
+	unmap_page(head);
 	VM_BUG_ON_PAGE(compound_mapcount(head), head);
 	/* Make sure the page is not on per-CPU pagevec as it takes pin */
@@ -2727,7 +2727,7 @@ int split_huge_page_to_list(struct page
 fail:		if (mapping)
 			xa_unlock(&mapping->i_pages);
 		spin_unlock_irqrestore(zone_lru_lock(page_zone(head)), flags);
-		unfreeze_page(head);
+		remap_page(head);
 		ret = -EBUSY;
 	}
--- a/mm/rmap.c~mm-huge_memory-rename-freeze_page-to-unmap_page
+++ a/mm/rmap.c
@@ -1627,16 +1627,9 @@ static bool try_to_unmap_one(struct page
 						      address + PAGE_SIZE);
 		} else {
 			/*
-			 * We should not need to notify here as we reach this
-			 * case only from freeze_page() itself only call from
-			 * split_huge_page_to_list() so everything below must
-			 * be true:
-			 *   - page is not anonymous
-			 *   - page is locked
-			 *
-			 * So as it is a locked file back page thus it can not
-			 * be remove from the page cache and replace by a new
-			 * page before mmu_notifier_invalidate_range_end so no
+			 * This is a locked file-backed page, thus it cannot
+			 * be removed from the page cache and replaced by a new
+			 * page before mmu_notifier_invalidate_range_end, so no
 			 * concurrent thread might update its page table to
 			 * point at new page while a device still is using this
 			 * page.
_

Patches currently in -mm which might be from hughd(a)google.com are

mm-put_and_wait_on_page_locked-while-page-is-migrated.patch


[merged] userfaultfd-shmem-add-i_size-checks.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: userfaultfd: shmem: add i_size checks
has been removed from the -mm tree.  Its filename was
     userfaultfd-shmem-add-i_size-checks.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Andrea Arcangeli <aarcange(a)redhat.com>
Subject: userfaultfd: shmem: add i_size checks

With MAP_SHARED: recheck the i_size after taking the PT lock, to
serialize against truncate with the PT lock.  Delete the page from the
pagecache if the i_size_read check fails.

With MAP_PRIVATE: check the i_size after the PT lock before mapping
anonymous memory or zeropages into the MAP_PRIVATE shmem mapping.

A mostly irrelevant cleanup: like we do the delete_from_page_cache()
pagecache removal after dropping the PT lock, the PT lock is a spinlock
so drop it before the sleepable page lock.

Link: http://lkml.kernel.org/r/20181126173452.26955-5-aarcange@redhat.com
Fixes: 4c27fe4c4c84 ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
Signed-off-by: Andrea Arcangeli <aarcange(a)redhat.com>
Reviewed-by: Mike Rapoport <rppt(a)linux.ibm.com>
Reviewed-by: Hugh Dickins <hughd(a)google.com>
Reported-by: Jann Horn <jannh(a)google.com>
Cc: <stable(a)vger.kernel.org>
Cc: "Dr. David Alan Gilbert" <dgilbert(a)redhat.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/shmem.c       | 18 ++++++++++++++++--
 mm/userfaultfd.c | 26 ++++++++++++++++++++++++--
 2 files changed, 40 insertions(+), 4 deletions(-)

--- a/mm/shmem.c~userfaultfd-shmem-add-i_size-checks
+++ a/mm/shmem.c
@@ -2216,6 +2216,7 @@ static int shmem_mfill_atomic_pte(struct
 	struct page *page;
 	pte_t _dst_pte, *dst_pte;
 	int ret;
+	pgoff_t offset, max_off;
 	ret = -ENOMEM;
 	if (!shmem_inode_acct_block(inode, 1))
@@ -2253,6 +2254,12 @@ static int shmem_mfill_atomic_pte(struct
 	__SetPageSwapBacked(page);
 	__SetPageUptodate(page);
+	ret = -EFAULT;
+	offset = linear_page_index(dst_vma, dst_addr);
+	max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+	if (unlikely(offset >= max_off))
+		goto out_release;
+
 	ret = mem_cgroup_try_charge_delay(page, dst_mm, gfp, &memcg, false);
 	if (ret)
 		goto out_release;
@@ -2268,8 +2275,14 @@ static int shmem_mfill_atomic_pte(struct
 	if (dst_vma->vm_flags & VM_WRITE)
 		_dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte));
-	ret = -EEXIST;
 	dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
+
+	ret = -EFAULT;
+	max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+	if (unlikely(offset >= max_off))
+		goto out_release_uncharge_unlock;
+
+	ret = -EEXIST;
 	if (!pte_none(*dst_pte))
 		goto out_release_uncharge_unlock;
@@ -2287,13 +2300,14 @@ static int shmem_mfill_atomic_pte(struct
 	/* No need to invalidate - it was non-present before */
 	update_mmu_cache(dst_vma, dst_addr, dst_pte);
-	unlock_page(page);
 	pte_unmap_unlock(dst_pte, ptl);
+	unlock_page(page);
 	ret = 0;
 out:
 	return ret;
 out_release_uncharge_unlock:
 	pte_unmap_unlock(dst_pte, ptl);
+	delete_from_page_cache(page);
 out_release_uncharge:
 	mem_cgroup_cancel_charge(page, memcg, false);
 out_release:
--- a/mm/userfaultfd.c~userfaultfd-shmem-add-i_size-checks
+++ a/mm/userfaultfd.c
@@ -33,6 +33,8 @@ static int mcopy_atomic_pte(struct mm_st
 	void *page_kaddr;
 	int ret;
 	struct page *page;
+	pgoff_t offset, max_off;
+	struct inode *inode;
 	if (!*pagep) {
 		ret = -ENOMEM;
@@ -73,8 +75,17 @@ static int mcopy_atomic_pte(struct mm_st
 	if (dst_vma->vm_flags & VM_WRITE)
 		_dst_pte = pte_mkwrite(pte_mkdirty(_dst_pte));
-	ret = -EEXIST;
 	dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
+	if (dst_vma->vm_file) {
+		/* the shmem MAP_PRIVATE case requires checking the i_size */
+		inode = dst_vma->vm_file->f_inode;
+		offset = linear_page_index(dst_vma, dst_addr);
+		max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+		ret = -EFAULT;
+		if (unlikely(offset >= max_off))
+			goto out_release_uncharge_unlock;
+	}
+	ret = -EEXIST;
 	if (!pte_none(*dst_pte))
 		goto out_release_uncharge_unlock;
@@ -108,11 +119,22 @@ static int mfill_zeropage_pte(struct mm_
 	pte_t _dst_pte, *dst_pte;
 	spinlock_t *ptl;
 	int ret;
+	pgoff_t offset, max_off;
+	struct inode *inode;
 	_dst_pte = pte_mkspecial(pfn_pte(my_zero_pfn(dst_addr),
 					 dst_vma->vm_page_prot));
-	ret = -EEXIST;
 	dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
+	if (dst_vma->vm_file) {
+		/* the shmem MAP_PRIVATE case requires checking the i_size */
+		inode = dst_vma->vm_file->f_inode;
+		offset = linear_page_index(dst_vma, dst_addr);
+		max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+		ret = -EFAULT;
+		if (unlikely(offset >= max_off))
+			goto out_unlock;
+	}
+	ret = -EEXIST;
 	if (!pte_none(*dst_pte))
 		goto out_unlock;
 	set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
_

Patches currently in -mm which might be from aarcange(a)redhat.com are
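
The bounds test added by this patch can be tried in isolation. The sketch
below is a userspace approximation under stated assumptions: struct fake_vma,
page_index() and check_fill_offset() are hypothetical names, a 4 KiB page is
assumed, and page_index() mirrors what linear_page_index() computes for an
ordinary (non-hugetlb) mapping.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define PAGE_SHIFT	12			/* assumed 4 KiB pages */
#define PAGE_SIZE	(1ULL << PAGE_SHIFT)

/* Hypothetical, minimal view of a VMA: start address and file page offset. */
struct fake_vma {
	uint64_t vm_start;
	uint64_t vm_pgoff;
};

/* File page index backing the given address in this mapping. */
static uint64_t page_index(const struct fake_vma *vma, uint64_t addr)
{
	assert(addr >= vma->vm_start);
	return ((addr - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
}

/* Returns 0 if the target page lies inside i_size, -EFAULT otherwise. */
static int check_fill_offset(const struct fake_vma *vma, uint64_t addr,
			     uint64_t i_size)
{
	uint64_t max_off = (i_size + PAGE_SIZE - 1) / PAGE_SIZE;

	return page_index(vma, addr) >= max_off ? -EFAULT : 0;
}
```

In the patch the same comparison is repeated after taking the PT lock,
because i_size may have changed between the first check and the lock.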


[merged] userfaultfd-shmem-hugetlbfs-only-allow-to-register-vm_maywrite-vmas.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas
has been removed from the -mm tree.  Its filename was
     userfaultfd-shmem-hugetlbfs-only-allow-to-register-vm_maywrite-vmas.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Andrea Arcangeli <aarcange(a)redhat.com>
Subject: userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas

After the VMA to register the uffd onto is found, check that it has
VM_MAYWRITE set before allowing registration.  This way we inherit all
common code checks before allowing to fill file holes in shmem and
hugetlbfs with UFFDIO_COPY.

The userfaultfd memory model is not applicable for readonly files unless
it's a MAP_PRIVATE.

Link: http://lkml.kernel.org/r/20181126173452.26955-4-aarcange@redhat.com
Fixes: ff62a3421044 ("hugetlb: implement memfd sealing")
Signed-off-by: Andrea Arcangeli <aarcange(a)redhat.com>
Reviewed-by: Mike Rapoport <rppt(a)linux.ibm.com>
Reviewed-by: Hugh Dickins <hughd(a)google.com>
Reported-by: Jann Horn <jannh(a)google.com>
Fixes: 4c27fe4c4c84 ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
Cc: <stable(a)vger.kernel.org>
Cc: "Dr. David Alan Gilbert" <dgilbert(a)redhat.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 fs/userfaultfd.c | 15 +++++++++++++++
 mm/userfaultfd.c | 15 ++++++---------
 2 files changed, 21 insertions(+), 9 deletions(-)

--- a/fs/userfaultfd.c~userfaultfd-shmem-hugetlbfs-only-allow-to-register-vm_maywrite-vmas
+++ a/fs/userfaultfd.c
@@ -1361,6 +1361,19 @@ static int userfaultfd_register(struct u
 		ret = -EINVAL;
 		if (!vma_can_userfault(cur))
 			goto out_unlock;
+
+		/*
+		 * UFFDIO_COPY will fill file holes even without
+		 * PROT_WRITE. This check enforces that if this is a
+		 * MAP_SHARED, the process has write permission to the backing
+		 * file. If VM_MAYWRITE is set it also enforces that on a
+		 * MAP_SHARED vma: there is no F_WRITE_SEAL and no further
+		 * F_WRITE_SEAL can be taken until the vma is destroyed.
+		 */
+		ret = -EPERM;
+		if (unlikely(!(cur->vm_flags & VM_MAYWRITE)))
+			goto out_unlock;
+
 		/*
 		 * If this vma contains ending address, and huge pages
 		 * check alignment.
@@ -1406,6 +1419,7 @@ static int userfaultfd_register(struct u
 		BUG_ON(!vma_can_userfault(vma));
 		BUG_ON(vma->vm_userfaultfd_ctx.ctx &&
 		       vma->vm_userfaultfd_ctx.ctx != ctx);
+		WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
 		/*
 		 * Nothing to do: this vma is already registered into this
@@ -1552,6 +1566,7 @@ static int userfaultfd_unregister(struct
 		cond_resched();
 		BUG_ON(!vma_can_userfault(vma));
+		WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
 		/*
 		 * Nothing to do: this vma is already registered into this
--- a/mm/userfaultfd.c~userfaultfd-shmem-hugetlbfs-only-allow-to-register-vm_maywrite-vmas
+++ a/mm/userfaultfd.c
@@ -205,8 +205,9 @@ retry:
 		if (!dst_vma || !is_vm_hugetlb_page(dst_vma))
 			goto out_unlock;
 		/*
-		 * Only allow __mcopy_atomic_hugetlb on userfaultfd
-		 * registered ranges.
+		 * Check the vma is registered in uffd, this is
+		 * required to enforce the VM_MAYWRITE check done at
+		 * uffd registration time.
 		 */
 		if (!dst_vma->vm_userfaultfd_ctx.ctx)
 			goto out_unlock;
@@ -459,13 +460,9 @@ retry:
 	if (!dst_vma)
 		goto out_unlock;
 	/*
-	 * Be strict and only allow __mcopy_atomic on userfaultfd
-	 * registered ranges to prevent userland errors going
-	 * unnoticed. As far as the VM consistency is concerned, it
-	 * would be perfectly safe to remove this check, but there's
-	 * no useful usage for __mcopy_atomic ouside of userfaultfd
-	 * registered ranges. This is after all why these are ioctls
-	 * belonging to the userfaultfd and not syscalls.
+	 * Check the vma is registered in uffd, this is required to
+	 * enforce the VM_MAYWRITE check done at uffd registration
+	 * time.
 	 */
 	if (!dst_vma->vm_userfaultfd_ctx.ctx)
 		goto out_unlock;
_

Patches currently in -mm which might be from aarcange(a)redhat.com are
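
As a rough illustration of the registration rule, the sketch below rejects a
range if any VMA in it lacks VM_MAYWRITE. It is a userspace toy, not the
kernel code: check_registration() is a hypothetical helper, each VMA is
reduced to its flags word, and the flag values are given here for
illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Flag values shown for illustration; treat them as assumptions. */
#define VM_SHARED	0x08UL
#define VM_MAYWRITE	0x20UL

/*
 * Hypothetical helper: vm_flags[] holds the flags of every VMA covered by
 * the range being registered. All of them must allow writing eventually
 * (VM_MAYWRITE), otherwise the whole registration is refused with -EPERM.
 */
static int check_registration(const unsigned long *vm_flags, size_t n)
{
	size_t i;

	assert(vm_flags != NULL || n == 0);

	for (i = 0; i < n; i++)
		if (!(vm_flags[i] & VM_MAYWRITE))
			return -EPERM;
	return 0;
}
```

A read-only shared mapping of a file opened without write permission never
gets VM_MAYWRITE, which is why this single test is enough to keep
UFFDIO_COPY from filling holes in such a file.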


[merged] userfaultfd-shmem-allocate-anonymous-memory-for-map_private-shmem.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: userfaultfd: shmem: allocate anonymous memory for MAP_PRIVATE shmem
has been removed from the -mm tree.  Its filename was
     userfaultfd-shmem-allocate-anonymous-memory-for-map_private-shmem.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Andrea Arcangeli <aarcange(a)redhat.com>
Subject: userfaultfd: shmem: allocate anonymous memory for MAP_PRIVATE shmem

Userfaultfd did not create private memory when UFFDIO_COPY was invoked
on a MAP_PRIVATE shmem mapping.  Instead it wrote to the shmem file,
even when that had not been opened for writing.  Though, fortunately,
that could only happen where there was a hole in the file.

Fix the shmem-backed implementation of UFFDIO_COPY to create private
memory for MAP_PRIVATE mappings.  The hugetlbfs-backed implementation
was already correct.

This change is visible to userland, if userfaultfd has been used in
unintended ways: so it introduces a small risk of incompatibility, but
is necessary in order to respect file permissions.

An app that uses UFFDIO_COPY for anything like postcopy live migration
won't notice the difference, and in fact it'll run faster because there
will be no copy-on-write and memory waste in the tmpfs pagecache
anymore.

Userfaults on MAP_PRIVATE shmem keep triggering only on file holes like
before.

The real zeropage can also be built on a MAP_PRIVATE shmem mapping
through UFFDIO_ZEROPAGE and that's safe because the zeropage pte is
never dirty, in turn even an mprotect upgrading the vma permission from
PROT_READ to PROT_READ|PROT_WRITE won't make the zeropage pte writable.

Link: http://lkml.kernel.org/r/20181126173452.26955-3-aarcange@redhat.com
Fixes: 4c27fe4c4c84 ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
Signed-off-by: Andrea Arcangeli <aarcange(a)redhat.com>
Reported-by: Mike Rapoport <rppt(a)linux.ibm.com>
Reviewed-by: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org>
Cc: "Dr. David Alan Gilbert" <dgilbert(a)redhat.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/userfaultfd.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

--- a/mm/userfaultfd.c~userfaultfd-shmem-allocate-anonymous-memory-for-map_private-shmem
+++ a/mm/userfaultfd.c
@@ -380,7 +380,17 @@ static __always_inline ssize_t mfill_ato
 {
 	ssize_t err;
-	if (vma_is_anonymous(dst_vma)) {
+	/*
+	 * The normal page fault path for a shmem will invoke the
+	 * fault, fill the hole in the file and COW it right away. The
+	 * result generates plain anonymous memory. So when we are
+	 * asked to fill an hole in a MAP_PRIVATE shmem mapping, we'll
+	 * generate anonymous memory directly without actually filling
+	 * the hole. For the MAP_PRIVATE case the robustness check
+	 * only happens in the pagetable (to verify it's still none)
+	 * and not in the radix tree.
+	 */
+	if (!(dst_vma->vm_flags & VM_SHARED)) {
 		if (!zeropage)
 			err = mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma,
 					       dst_addr, src_addr, page);
@@ -489,7 +499,8 @@ retry:
 	 * dst_vma.
 	 */
 	err = -ENOMEM;
-	if (vma_is_anonymous(dst_vma) && unlikely(anon_vma_prepare(dst_vma)))
+	if (!(dst_vma->vm_flags & VM_SHARED) &&
+	    unlikely(anon_vma_prepare(dst_vma)))
 		goto out_unlock;
 	while (src_addr < src_start + len) {
_

Patches currently in -mm which might be from aarcange(a)redhat.com are
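
The behavioural change in this patch comes down to which predicate selects
the anonymous fill path. The sketch below contrasts the old and new choice in
userspace; struct fake_vma, its has_vm_ops field (standing in for "the VMA
has vm_ops", which is what vma_is_anonymous() tests) and the two helper
names are hypothetical, and the VM_SHARED value is shown for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define VM_SHARED	0x08UL	/* illustrative value */

enum fill_path { FILL_ANON, FILL_SHMEM };

/* Hypothetical, minimal view of a destination VMA. */
struct fake_vma {
	unsigned long vm_flags;
	int has_vm_ops;		/* non-zero for file-backed (e.g. shmem) VMAs */
};

/* Before the patch: only truly anonymous VMAs took the anonymous path. */
static enum fill_path old_choice(const struct fake_vma *vma)
{
	assert(vma != NULL);
	return !vma->has_vm_ops ? FILL_ANON : FILL_SHMEM;
}

/* After the patch: every private mapping takes the anonymous path. */
static enum fill_path new_choice(const struct fake_vma *vma)
{
	assert(vma != NULL);
	return !(vma->vm_flags & VM_SHARED) ? FILL_ANON : FILL_SHMEM;
}
```

The only case where the two disagree is a MAP_PRIVATE mapping of a shmem
file, which is exactly the case the commit message describes.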


[merged] userfaultfd-use-enoent-instead-of-efault-if-the-atomic-copy-user-fails.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: userfaultfd: use ENOENT instead of EFAULT if the atomic copy user fails
has been removed from the -mm tree.  Its filename was
     userfaultfd-use-enoent-instead-of-efault-if-the-atomic-copy-user-fails.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Andrea Arcangeli <aarcange(a)redhat.com>
Subject: userfaultfd: use ENOENT instead of EFAULT if the atomic copy user fails

Patch series "userfaultfd shmem updates".

Jann found two bugs in the userfaultfd shmem MAP_SHARED backend: the
lack of the VM_MAYWRITE check and the lack of i_size checks.

Then looking into the above we also fixed the MAP_PRIVATE case.

Hugh by source review also found a data loss source if UFFDIO_COPY is
used on shmem MAP_SHARED PROT_READ mappings (the production usages
incidentally run with PROT_READ|PROT_WRITE, so the data loss couldn't
happen in those production usages like with QEMU).

The whole patchset is marked for stable.

We verified QEMU postcopy live migration with guest running on shmem
MAP_PRIVATE run as well as before after the fix of shmem MAP_PRIVATE.
Regardless if it's shmem or hugetlbfs or MAP_PRIVATE or MAP_SHARED, QEMU
unconditionally invokes a punch hole if the guest mapping is filebacked
and a MADV_DONTNEED too (needed to get rid of the MAP_PRIVATE COWs and
for the anon backend).

This patch (of 5):

We internally used EFAULT to communicate with the caller, switch to
ENOENT, so EFAULT can be used as a non internal retval.

Link: http://lkml.kernel.org/r/20181126173452.26955-2-aarcange@redhat.com
Fixes: 4c27fe4c4c84 ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
Signed-off-by: Andrea Arcangeli <aarcange(a)redhat.com>
Reviewed-by: Mike Rapoport <rppt(a)linux.ibm.com>
Reviewed-by: Hugh Dickins <hughd(a)google.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Peter Xu <peterx(a)redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Cc: stable(a)vger.kernel.org
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/hugetlb.c     | 2 +-
 mm/shmem.c       | 2 +-
 mm/userfaultfd.c | 6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)

--- a/mm/hugetlb.c~userfaultfd-use-enoent-instead-of-efault-if-the-atomic-copy-user-fails
+++ a/mm/hugetlb.c
@@ -4080,7 +4080,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_s
 		/* fallback to copy_from_user outside mmap_sem */
 		if (unlikely(ret)) {
-			ret = -EFAULT;
+			ret = -ENOENT;
 			*pagep = page;
 			/* don't free the page */
 			goto out;
--- a/mm/shmem.c~userfaultfd-use-enoent-instead-of-efault-if-the-atomic-copy-user-fails
+++ a/mm/shmem.c
@@ -2238,7 +2238,7 @@ static int shmem_mfill_atomic_pte(struct
 				*pagep = page;
 				shmem_inode_unacct_blocks(inode, 1);
 				/* don't free the page */
-				return -EFAULT;
+				return -ENOENT;
 			}
 		} else {		/* mfill_zeropage_atomic */
 			clear_highpage(page);
--- a/mm/userfaultfd.c~userfaultfd-use-enoent-instead-of-efault-if-the-atomic-copy-user-fails
+++ a/mm/userfaultfd.c
@@ -48,7 +48,7 @@ static int mcopy_atomic_pte(struct mm_st
 		/* fallback to copy_from_user outside mmap_sem */
 		if (unlikely(ret)) {
-			ret = -EFAULT;
+			ret = -ENOENT;
 			*pagep = page;
 			/* don't free the page */
 			goto out;
@@ -274,7 +274,7 @@ retry:
 		cond_resched();
-		if (unlikely(err == -EFAULT)) {
+		if (unlikely(err == -ENOENT)) {
 			up_read(&dst_mm->mmap_sem);
 			BUG_ON(!page);
@@ -530,7 +530,7 @@ retry:
 				       src_addr, &page, zeropage);
 		cond_resched();
-		if (unlikely(err == -EFAULT)) {
+		if (unlikely(err == -ENOENT)) {
 			void *page_kaddr;
 			up_read(&dst_mm->mmap_sem);
_

Patches currently in -mm which might be from aarcange(a)redhat.com are
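
Why a separate internal code is needed can be seen in a stripped-down retry
loop. This is a userspace sketch with hypothetical names (fill_fn,
fill_with_retry() and the two sample fill functions); it only mirrors the
control flow, where -ENOENT means "retry after copying the source outside
the lock" and -EFAULT is a real error that must reach the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-page fill step. */
typedef int (*fill_fn)(void *ctx, int have_page);

/*
 * Retry loop: -ENOENT is the internal "could not copy atomically" signal.
 * In the kernel the caller then drops mmap_sem, copies from user space
 * into the preallocated page, and retries. Any other value is returned.
 */
static int fill_with_retry(fill_fn fill, void *ctx)
{
	int have_page = 0;

	assert(fill != NULL);

	for (;;) {
		int err = fill(ctx, have_page);

		if (err == -ENOENT && !have_page) {
			have_page = 1;	/* slow-path copy happened here */
			continue;
		}
		return err;
	}
}

/* Sample step: needs the slow-path copy once, then succeeds. */
static int fill_needs_slow_copy(void *ctx, int have_page)
{
	(void)ctx;
	return have_page ? 0 : -ENOENT;
}

/* Sample step: target is beyond i_size, a genuine -EFAULT. */
static int fill_beyond_eof(void *ctx, int have_page)
{
	(void)ctx;
	(void)have_page;
	return -EFAULT;
}
```

Had the internal signal stayed -EFAULT, the second case would have been
mistaken for a retry request instead of being reported.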


[merged] test_kmod-fix-rmmod-double-free.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: lib/test_kmod.c: fix rmmod double free
has been removed from the -mm tree.  Its filename was
     test_kmod-fix-rmmod-double-free.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Luis Chamberlain <mcgrof(a)kernel.org>
Subject: lib/test_kmod.c: fix rmmod double free

We free the misc device string twice on rmmod; fix this.  Without this
we cannot remove the module without crashing.

Link: http://lkml.kernel.org/r/20181124050500.5257-1-mcgrof@kernel.org
Signed-off-by: Luis Chamberlain <mcgrof(a)kernel.org>
Reported-by: Randy Dunlap <rdunlap(a)infradead.org>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: <stable(a)vger.kernel.org>	[4.12+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 lib/test_kmod.c | 1 -
 1 file changed, 1 deletion(-)

--- a/lib/test_kmod.c~test_kmod-fix-rmmod-double-free
+++ a/lib/test_kmod.c
@@ -1214,7 +1214,6 @@ void unregister_test_dev_kmod(struct kmo
 	dev_info(test_dev->dev, "removing interface\n");
 	misc_deregister(&test_dev->misc_dev);
-	kfree(&test_dev->misc_dev.name);
 	mutex_unlock(&test_dev->config_mutex);
 	mutex_unlock(&test_dev->trigger_mutex);
_

Patches currently in -mm which might be from mcgrof(a)kernel.org are
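
The fix follows a single-owner rule for the device name. The sketch below
shows that rule in userspace with malloc()/free(); struct fake_dev and the
fake_dev_*() functions are hypothetical, and it assumes, as the commit
message implies, that a separate teardown path already releases the string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the test device. */
struct fake_dev {
	char *name;		/* heap string owned by the device */
	int registered;
};

static int name_frees;		/* counts how often the name was released */

static int fake_dev_init(struct fake_dev *dev, const char *name)
{
	size_t len;

	assert(name != NULL);

	len = strlen(name) + 1;
	dev->name = malloc(len);
	if (!dev->name)
		return -1;
	memcpy(dev->name, name, len);
	dev->registered = 1;
	return 0;
}

/* Unregister only undoes registration; it does not free the name. */
static void fake_dev_unregister(struct fake_dev *dev)
{
	dev->registered = 0;
}

/* The single owner of the name: frees it once and clears the pointer. */
static void fake_dev_free(struct fake_dev *dev)
{
	if (dev->name) {
		free(dev->name);
		dev->name = NULL;
		name_frees++;
	}
}
```

Note that the removed kernel line also passed the address of the name field
rather than the string it points to, so deleting the call is cleaner than
trying to correct it.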


[merged] mm-use-swp_offset-as-key-in-shmem_replace_page.patch removed from -mm tree

by akpm＠linux-foundation.org

The patch titled
     Subject: mm: use swp_offset as key in shmem_replace_page()
has been removed from the -mm tree.  Its filename was
     mm-use-swp_offset-as-key-in-shmem_replace_page.patch

This patch was dropped because it was merged into mainline or a subsystem tree

------------------------------------------------------
From: Yu Zhao <yuzhao(a)google.com>
Subject: mm: use swp_offset as key in shmem_replace_page()

We changed the key of swap cache tree from swp_entry_t.val to
swp_offset.  We need to do so in shmem_replace_page() as well.

Hugh said:
: shmem_replace_page() has been wrong since the day I wrote it: good
: enough to work on swap "type" 0, which is all most people ever use
: (especially those few who need shmem_replace_page() at all), but broken
: once there are any non-0 swp_type bits set in the higher order bits.

Link: http://lkml.kernel.org/r/20181121215442.138545-1-yuzhao@google.com
Fixes: f6ab1f7f6b2d ("mm, swap: use offset of swap entry as key of swap cache")
Signed-off-by: Yu Zhao <yuzhao(a)google.com>
Reviewed-by: Matthew Wilcox <willy(a)infradead.org>
Acked-by: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org>	[4.9+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---

 mm/shmem.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/mm/shmem.c~mm-use-swp_offset-as-key-in-shmem_replace_page
+++ a/mm/shmem.c
@@ -1509,11 +1509,13 @@ static int shmem_replace_page(struct pag
 {
 	struct page *oldpage, *newpage;
 	struct address_space *swap_mapping;
+	swp_entry_t entry;
 	pgoff_t swap_index;
 	int error;
 	oldpage = *pagep;
-	swap_index = page_private(oldpage);
+	entry.val = page_private(oldpage);
+	swap_index = swp_offset(entry);
 	swap_mapping = page_mapping(oldpage);
 	/*
@@ -1532,7 +1534,7 @@ static int shmem_replace_page(struct pag
 	__SetPageLocked(newpage);
 	__SetPageSwapBacked(newpage);
 	SetPageUptodate(newpage);
-	set_page_private(newpage, swap_index);
+	set_page_private(newpage, entry.val);
 	SetPageSwapCache(newpage);
 	/*
_

Patches currently in -mm which might be from yuzhao(a)google.com are

mm-remove-pte_lock_deinit.patch
mm-dont-expose-page-to-fast-gup-before-its-ready.patch
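
Hugh's remark about swap type 0 is easy to reproduce with a toy encoding.
The sketch below packs a type into the high bits and an offset into the low
bits of one word; the shift of 58 and these userspace definitions are
assumptions for illustration, not the kernel's actual layout on any given
architecture.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for this sketch: type in the top bits, offset below. */
#define SWP_TYPE_SHIFT	58
#define SWP_OFFSET_MASK	((UINT64_C(1) << SWP_TYPE_SHIFT) - 1)

typedef struct {
	uint64_t val;
} swp_entry_t;

static swp_entry_t swp_entry(unsigned int type, uint64_t offset)
{
	swp_entry_t e;

	assert(offset <= SWP_OFFSET_MASK);
	e.val = ((uint64_t)type << SWP_TYPE_SHIFT) | offset;
	return e;
}

static unsigned int swp_type(swp_entry_t e)
{
	return (unsigned int)(e.val >> SWP_TYPE_SHIFT);
}

static uint64_t swp_offset(swp_entry_t e)
{
	return e.val & SWP_OFFSET_MASK;
}
```

For type 0 the raw value and the offset coincide, so using entry.val as the
swap cache key happened to work; for any other type the two differ and the
lookup key is wrong.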
