August 2024 - Linux-stable-mirror

+ squashfs-sanity-check-symbolic-link-size.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled Subject: Squashfs: sanity check symbolic link size has been added to the -mm mm-hotfixes-unstable branch. Its filename is squashfs-sanity-check-symbolic-link-size.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche… This patch will later appear in the mm-hotfixes-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Phillip Lougher <phillip(a)squashfs.org.uk> Subject: Squashfs: sanity check symbolic link size Date: Sun, 11 Aug 2024 21:13:01 +0100 Syzkiller reports a "KMSAN: uninit-value in pick_link" bug. This is caused by an uninitialised page, which is ultimately caused by a corrupted symbolic link size read from disk. The reason why the corrupted symlink size causes an uninitialised page is due to the following sequence of events: 1. squashfs_read_inode() is called to read the symbolic link from disk. This assigns the corrupted value 3875536935 to inode->i_size. 2. Later squashfs_symlink_read_folio() is called, which assigns this corrupted value to the length variable, which being a signed int, overflows producing a negative number. 3. The following loop that fills in the page contents checks that the copied bytes is less than length, which being negative means the loop is skipped, producing an unitialised page. This patch adds a sanity check which checks that the symbolic link size is not larger than expected. Link: https://lkml.kernel.org/r/20240811201301.13076-1-phillip@squashfs.org.uk Signed-off-by: Phillip Lougher <phillip(a)squashfs.org.uk> Reported-by: Lizhi Xu <lizhi.xu(a)windriver.com> Reported-by: syzbot+24ac24ff58dc5b0d26b9(a)syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/000000000000a90e8c061e86a76b@google.com/ Cc: Al Viro <viro(a)zeniv.linux.org.uk> Cc: Christian Brauner <brauner(a)kernel.org> Cc: Jan Kara <jack(a)suse.cz> Cc: Phillip Lougher <phillip(a)squashfs.org.uk> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- fs/squashfs/inode.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) --- a/fs/squashfs/inode.c~squashfs-sanity-check-symbolic-link-size +++ a/fs/squashfs/inode.c @@ -279,8 +279,13 @@ int squashfs_read_inode(struct inode *in if (err < 0) goto failed_read; - set_nlink(inode, le32_to_cpu(sqsh_ino->nlink)); inode->i_size = le32_to_cpu(sqsh_ino->symlink_size); + if (inode->i_size > PAGE_SIZE) { + ERROR("Corrupted symlink\n"); + return -EINVAL; + } + + set_nlink(inode, le32_to_cpu(sqsh_ino->nlink)); inode->i_op = &squashfs_symlink_inode_ops; inode_nohighmem(inode); inode->i_data.a_ops = &squashfs_symlink_aops; _ Patches currently in -mm which might be from phillip(a)squashfs.org.uk are squashfs-sanity-check-symbolic-link-size.patch

10 months, 4 weeks

1
0
0 0

+ userfaultfd-dont-bug_on-if-khugepaged-yanks-our-page-table.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled Subject: userfaultfd: don't BUG_ON() if khugepaged yanks our page table has been added to the -mm mm-hotfixes-unstable branch. Its filename is userfaultfd-dont-bug_on-if-khugepaged-yanks-our-page-table.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche… This patch will later appear in the mm-hotfixes-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Jann Horn <jannh(a)google.com> Subject: userfaultfd: don't BUG_ON() if khugepaged yanks our page table Date: Tue, 13 Aug 2024 22:25:22 +0200 Since khugepaged was changed to allow retracting page tables in file mappings without holding the mmap lock, these BUG_ON()s are wrong - get rid of them. We could also remove the preceding "if (unlikely(...))" block, but then we could reach pte_offset_map_lock() with transhuge pages not just for file mappings but also for anonymous mappings - which would probably be fine but I think is not necessarily expected. Link: https://lkml.kernel.org/r/20240813-uffd-thp-flip-fix-v2-2-5efa61078a41@goog… Fixes: 1d65b771bc08 ("mm/khugepaged: retract_page_tables() without mmap or vma lock") Signed-off-by: Jann Horn <jannh(a)google.com> Reviewed-by: Qi Zheng <zhengqi.arch(a)bytedance.com> Acked-by: David Hildenbrand <david(a)redhat.com> Cc: Andrea Arcangeli <aarcange(a)redhat.com> Cc: Hugh Dickins <hughd(a)google.com> Cc: Pavel Emelyanov <xemul(a)virtuozzo.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/userfaultfd.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) --- a/mm/userfaultfd.c~userfaultfd-dont-bug_on-if-khugepaged-yanks-our-page-table +++ a/mm/userfaultfd.c @@ -807,9 +807,10 @@ retry: err = -EFAULT; break; } - - BUG_ON(pmd_none(*dst_pmd)); - BUG_ON(pmd_trans_huge(*dst_pmd)); + /* + * For shmem mappings, khugepaged is allowed to remove page + * tables under us; pte_offset_map_lock() will deal with that. + */ err = mfill_atomic_pte(dst_pmd, dst_vma, dst_addr, src_addr, flags, &folio); _ Patches currently in -mm which might be from jannh(a)google.com are userfaultfd-fix-checks-for-huge-pmds.patch userfaultfd-dont-bug_on-if-khugepaged-yanks-our-page-table.patch mm-fix-harmless-type-confusion-in-lock_vma_under_rcu.patch

10 months, 4 weeks

1
0
0 0

+ userfaultfd-fix-checks-for-huge-pmds.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled Subject: userfaultfd: fix checks for huge PMDs has been added to the -mm mm-hotfixes-unstable branch. Its filename is userfaultfd-fix-checks-for-huge-pmds.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche… This patch will later appear in the mm-hotfixes-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Jann Horn <jannh(a)google.com> Subject: userfaultfd: fix checks for huge PMDs Date: Tue, 13 Aug 2024 22:25:21 +0200 Patch series "userfaultfd: fix races around pmd_trans_huge() check", v2. The pmd_trans_huge() code in mfill_atomic() is wrong in three different ways depending on kernel version: 1. The pmd_trans_huge() check is racy and can lead to a BUG_ON() (if you hit the right two race windows) - I've tested this in a kernel build with some extra mdelay() calls. See the commit message for a description of the race scenario. On older kernels (before 6.5), I think the same bug can even theoretically lead to accessing transhuge page contents as a page table if you hit the right 5 narrow race windows (I haven't tested this case). 2. As pointed out by Qi Zheng, pmd_trans_huge() is not sufficient for detecting PMDs that don't point to page tables. On older kernels (before 6.5), you'd just have to win a single fairly wide race to hit this. I've tested this on 6.1 stable by racing migration (with a mdelay() patched into try_to_migrate()) against UFFDIO_ZEROPAGE - on my x86 VM, that causes a kernel oops in ptlock_ptr(). 3. On newer kernels (>=6.5), for shmem mappings, khugepaged is allowed to yank page tables out from under us (though I haven't tested that), so I think the BUG_ON() checks in mfill_atomic() are just wrong. I decided to write two separate fixes for these (one fix for bugs 1+2, one fix for bug 3), so that the first fix can be backported to kernels affected by bugs 1+2. This patch (of 2): This fixes two issues. I discovered that the following race can occur: mfill_atomic other thread ============ ============ <zap PMD> pmdp_get_lockless() [reads none pmd] <bail if trans_huge> <if none:> <pagefault creates transhuge zeropage> __pte_alloc [no-op] <zap PMD> <bail if pmd_trans_huge(*dst_pmd)> BUG_ON(pmd_none(*dst_pmd)) I have experimentally verified this in a kernel with extra mdelay() calls; the BUG_ON(pmd_none(*dst_pmd)) triggers. On kernels newer than commit 0d940a9b270b ("mm/pgtable: allow pte_offset_map[_lock]() to fail"), this can't lead to anything worse than a BUG_ON(), since the page table access helpers are actually designed to deal with page tables concurrently disappearing; but on older kernels (<=6.4), I think we could probably theoretically race past the two BUG_ON() checks and end up treating a hugepage as a page table. The second issue is that, as Qi Zheng pointed out, there are other types of huge PMDs that pmd_trans_huge() can't catch: devmap PMDs and swap PMDs (in particular, migration PMDs). On <=6.4, this is worse than the first issue: If mfill_atomic() runs on a PMD that contains a migration entry (which just requires winning a single, fairly wide race), it will pass the PMD to pte_offset_map_lock(), which assumes that the PMD points to a page table. Breakage follows: First, the kernel tries to take the PTE lock (which will crash or maybe worse if there is no "struct page" for the address bits in the migration entry PMD - I think at least on X86 there usually is no corresponding "struct page" thanks to the PTE inversion mitigation, amd64 looks different). If that didn't crash, the kernel would next try to write a PTE into what it wrongly thinks is a page table. As part of fixing these issues, get rid of the check for pmd_trans_huge() before __pte_alloc() - that's redundant, we're going to have to check for that after the __pte_alloc() anyway. Backport note: pmdp_get_lockless() is pmd_read_atomic() in older kernels. Link: https://lkml.kernel.org/r/20240813-uffd-thp-flip-fix-v2-0-5efa61078a41@goog… Link: https://lkml.kernel.org/r/20240813-uffd-thp-flip-fix-v2-1-5efa61078a41@goog… Fixes: c1a4de99fada ("userfaultfd: mcopy_atomic|mfill_zeropage: UFFDIO_COPY|UFFDIO_ZEROPAGE preparation") Signed-off-by: Jann Horn <jannh(a)google.com> Reported-by: Qi Zheng <zhengqi.arch(a)bytedance.com> Closes: https://lore.kernel.org/r/59bf3c2e-d58b-41af-ab10-3e631d802229@bytedance.com Acked-by: David Hildenbrand <david(a)redhat.com> Cc: Andrea Arcangeli <aarcange(a)redhat.com> Cc: Hugh Dickins <hughd(a)google.com> Cc: Jann Horn <jannh(a)google.com> Cc: Pavel Emelyanov <xemul(a)virtuozzo.com> Cc: Qi Zheng <zhengqi.arch(a)bytedance.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/userfaultfd.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) --- a/mm/userfaultfd.c~userfaultfd-fix-checks-for-huge-pmds +++ a/mm/userfaultfd.c @@ -787,21 +787,23 @@ retry: } dst_pmdval = pmdp_get_lockless(dst_pmd); - /* - * If the dst_pmd is mapped as THP don't - * override it and just be strict. - */ - if (unlikely(pmd_trans_huge(dst_pmdval))) { - err = -EEXIST; - break; - } if (unlikely(pmd_none(dst_pmdval)) && unlikely(__pte_alloc(dst_mm, dst_pmd))) { err = -ENOMEM; break; } - /* If an huge pmd materialized from under us fail */ - if (unlikely(pmd_trans_huge(*dst_pmd))) { + dst_pmdval = pmdp_get_lockless(dst_pmd); + /* + * If the dst_pmd is THP don't override it and just be strict. + * (This includes the case where the PMD used to be THP and + * changed back to none after __pte_alloc().) + */ + if (unlikely(!pmd_present(dst_pmdval) || pmd_trans_huge(dst_pmdval) || + pmd_devmap(dst_pmdval))) { + err = -EEXIST; + break; + } + if (unlikely(pmd_bad(dst_pmdval))) { err = -EFAULT; break; } _ Patches currently in -mm which might be from jannh(a)google.com are userfaultfd-fix-checks-for-huge-pmds.patch userfaultfd-dont-bug_on-if-khugepaged-yanks-our-page-table.patch mm-fix-harmless-type-confusion-in-lock_vma_under_rcu.patch

10 months, 4 weeks

1
0
0 0

+ mm-vmalloc-ensure-vmap_block-is-initialised-before-adding-to-queue.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled Subject: mm: vmalloc: ensure vmap_block is initialised before adding to queue has been added to the -mm mm-hotfixes-unstable branch. Its filename is mm-vmalloc-ensure-vmap_block-is-initialised-before-adding-to-queue.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche… This patch will later appear in the mm-hotfixes-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Will Deacon <will(a)kernel.org> Subject: mm: vmalloc: ensure vmap_block is initialised before adding to queue Date: Mon, 12 Aug 2024 18:16:06 +0100 Commit 8c61291fd850 ("mm: fix incorrect vbq reference in purge_fragmented_block") extended the 'vmap_block' structure to contain a 'cpu' field which is set at allocation time to the id of the initialising CPU. When a new 'vmap_block' is being instantiated by new_vmap_block(), the partially initialised structure is added to the local 'vmap_block_queue' xarray before the 'cpu' field has been initialised. If another CPU is concurrently walking the xarray (e.g. via vm_unmap_aliases()), then it may perform an out-of-bounds access to the remote queue thanks to an uninitialised index. This has been observed as UBSAN errors in Android: | Internal error: UBSAN: array index out of bounds: 00000000f2005512 [#1] PREEMPT SMP | | Call trace: | purge_fragmented_block+0x204/0x21c | _vm_unmap_aliases+0x170/0x378 | vm_unmap_aliases+0x1c/0x28 | change_memory_common+0x1dc/0x26c | set_memory_ro+0x18/0x24 | module_enable_ro+0x98/0x238 | do_init_module+0x1b0/0x310 Move the initialisation of 'vb->cpu' in new_vmap_block() ahead of the addition to the xarray. Link: https://lkml.kernel.org/r/20240812171606.17486-1-will@kernel.org Fixes: 8c61291fd850 ("mm: fix incorrect vbq reference in purge_fragmented_block") Signed-off-by: Will Deacon <will(a)kernel.org> Reviewed-by: Baoquan He <bhe(a)redhat.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki(a)gmail.com> Cc: Zhaoyang Huang <zhaoyang.huang(a)unisoc.com> Cc: Hailong.Liu <hailong.liu(a)oppo.com> Cc: Christoph Hellwig <hch(a)infradead.org> Cc: Lorenzo Stoakes <lstoakes(a)gmail.com> Cc: Thomas Gleixner <tglx(a)linutronix.de> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/vmalloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) --- a/mm/vmalloc.c~mm-vmalloc-ensure-vmap_block-is-initialised-before-adding-to-queue +++ a/mm/vmalloc.c @@ -2626,6 +2626,7 @@ static void *new_vmap_block(unsigned int vb->dirty_max = 0; bitmap_set(vb->used_map, 0, (1UL << order)); INIT_LIST_HEAD(&vb->free_list); + vb->cpu = raw_smp_processor_id(); xa = addr_to_vb_xa(va->va_start); vb_idx = addr_to_vb_idx(va->va_start); @@ -2642,7 +2643,6 @@ static void *new_vmap_block(unsigned int * integrity together with list_for_each_rcu from read * side. */ - vb->cpu = raw_smp_processor_id(); vbq = per_cpu_ptr(&vmap_block_queue, vb->cpu); spin_lock(&vbq->lock); list_add_tail_rcu(&vb->free_list, &vbq->free); _ Patches currently in -mm which might be from will(a)kernel.org are mm-vmalloc-ensure-vmap_block-is-initialised-before-adding-to-queue.patch

10 months, 4 weeks

1
0
0 0

+ alloc_tag-mark-pages-reserved-during-cma-activation-as-not-tagged.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled Subject: alloc_tag: mark pages reserved during CMA activation as not tagged has been added to the -mm mm-hotfixes-unstable branch. Its filename is alloc_tag-mark-pages-reserved-during-cma-activation-as-not-tagged.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche… This patch will later appear in the mm-hotfixes-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Suren Baghdasaryan <surenb(a)google.com> Subject: alloc_tag: mark pages reserved during CMA activation as not tagged Date: Tue, 13 Aug 2024 08:07:57 -0700 During CMA activation, pages in CMA area are prepared and then freed without being allocated. This triggers warnings when memory allocation debug config (CONFIG_MEM_ALLOC_PROFILING_DEBUG) is enabled. Fix this by marking these pages not tagged before freeing them. Link: https://lkml.kernel.org/r/20240813150758.855881-2-surenb@google.com Fixes: d224eb0287fb ("codetag: debug: mark codetags for reserved pages as empty") Signed-off-by: Suren Baghdasaryan <surenb(a)google.com> Acked-by: David Hildenbrand <david(a)redhat.com> Cc: Kees Cook <keescook(a)chromium.org> Cc: Kent Overstreet <kent.overstreet(a)linux.dev> Cc: Pasha Tatashin <pasha.tatashin(a)soleen.com> Cc: Sourav Panda <souravpanda(a)google.com> Cc: Vlastimil Babka <vbabka(a)suse.cz> Cc: <stable(a)vger.kernel.org> [6.10] Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- mm/mm_init.c | 2 ++ 1 file changed, 2 insertions(+) --- a/mm/mm_init.c~alloc_tag-mark-pages-reserved-during-cma-activation-as-not-tagged +++ a/mm/mm_init.c @@ -2244,6 +2244,8 @@ void __init init_cma_reserved_pageblock( set_pageblock_migratetype(page, MIGRATE_CMA); set_page_refcounted(page); + /* pages were reserved and not allocated */ + clear_page_tag_ref(page); __free_pages(page, pageblock_order); adjust_managed_page_count(page, pageblock_nr_pages); _ Patches currently in -mm which might be from surenb(a)google.com are alloc_tag-introduce-clear_page_tag_ref-helper-function.patch alloc_tag-mark-pages-reserved-during-cma-activation-as-not-tagged.patch

10 months, 4 weeks

1
0
0 0

+ alloc_tag-introduce-clear_page_tag_ref-helper-function.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled Subject: alloc_tag: introduce clear_page_tag_ref() helper function has been added to the -mm mm-hotfixes-unstable branch. Its filename is alloc_tag-introduce-clear_page_tag_ref-helper-function.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche… This patch will later appear in the mm-hotfixes-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Suren Baghdasaryan <surenb(a)google.com> Subject: alloc_tag: introduce clear_page_tag_ref() helper function Date: Tue, 13 Aug 2024 08:07:56 -0700 In several cases we are freeing pages which were not allocated using common page allocators. For such cases, in order to keep allocation accounting correct, we should clear the page tag to indicate that the page being freed is expected to not have a valid allocation tag. Introduce clear_page_tag_ref() helper function to be used for this. Link: https://lkml.kernel.org/r/20240813150758.855881-1-surenb@google.com Fixes: d224eb0287fb ("codetag: debug: mark codetags for reserved pages as empty") Signed-off-by: Suren Baghdasaryan <surenb(a)google.com> Suggested-by: David Hildenbrand <david(a)redhat.com> Acked-by: David Hildenbrand <david(a)redhat.com> Cc: Kees Cook <keescook(a)chromium.org> Cc: Kent Overstreet <kent.overstreet(a)linux.dev> Cc: Pasha Tatashin <pasha.tatashin(a)soleen.com> Cc: Sourav Panda <souravpanda(a)google.com> Cc: Vlastimil Babka <vbabka(a)suse.cz> Cc: <stable(a)vger.kernel.org> [6.10] Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- include/linux/pgalloc_tag.h | 13 +++++++++++++ mm/mm_init.c | 10 +--------- mm/page_alloc.c | 9 +-------- 3 files changed, 15 insertions(+), 17 deletions(-) --- a/include/linux/pgalloc_tag.h~alloc_tag-introduce-clear_page_tag_ref-helper-function +++ a/include/linux/pgalloc_tag.h @@ -43,6 +43,18 @@ static inline void put_page_tag_ref(unio page_ext_put(page_ext_from_codetag_ref(ref)); } +static inline void clear_page_tag_ref(struct page *page) +{ + if (mem_alloc_profiling_enabled()) { + union codetag_ref *ref = get_page_tag_ref(page); + + if (ref) { + set_codetag_empty(ref); + put_page_tag_ref(ref); + } + } +} + static inline void pgalloc_tag_add(struct page *page, struct task_struct *task, unsigned int nr) { @@ -126,6 +138,7 @@ static inline void pgalloc_tag_sub_pages static inline union codetag_ref *get_page_tag_ref(struct page *page) { return NULL; } static inline void put_page_tag_ref(union codetag_ref *ref) {} +static inline void clear_page_tag_ref(struct page *page) {} static inline void pgalloc_tag_add(struct page *page, struct task_struct *task, unsigned int nr) {} static inline void pgalloc_tag_sub(struct page *page, unsigned int nr) {} --- a/mm/mm_init.c~alloc_tag-introduce-clear_page_tag_ref-helper-function +++ a/mm/mm_init.c @@ -2459,15 +2459,7 @@ void __init memblock_free_pages(struct p } /* pages were reserved and not allocated */ - if (mem_alloc_profiling_enabled()) { - union codetag_ref *ref = get_page_tag_ref(page); - - if (ref) { - set_codetag_empty(ref); - put_page_tag_ref(ref); - } - } - + clear_page_tag_ref(page); __free_pages_core(page, order, MEMINIT_EARLY); } --- a/mm/page_alloc.c~alloc_tag-introduce-clear_page_tag_ref-helper-function +++ a/mm/page_alloc.c @@ -5815,14 +5815,7 @@ unsigned long free_reserved_area(void *s void free_reserved_page(struct page *page) { - if (mem_alloc_profiling_enabled()) { - union codetag_ref *ref = get_page_tag_ref(page); - - if (ref) { - set_codetag_empty(ref); - put_page_tag_ref(ref); - } - } + clear_page_tag_ref(page); ClearPageReserved(page); init_page_count(page); __free_page(page); _ Patches currently in -mm which might be from surenb(a)google.com are alloc_tag-introduce-clear_page_tag_ref-helper-function.patch alloc_tag-mark-pages-reserved-during-cma-activation-as-not-tagged.patch

10 months, 4 weeks

1
0
0 0

[PATCH 1/2] drm/vmwgfx: Prevent unmapping active read buffers

by Zack Rusin

The kms paths keep a persistent map active to read and compare the cursor buffer. These maps can race with each other in simple scenario where: a) buffer "a" mapped for update b) buffer "a" mapped for compare c) do the compare d) unmap "a" for compare e) update the cursor f) unmap "a" for update At step "e" the buffer has been unmapped and the read contents is bogus. Prevent unmapping of active read buffers by simply keeping a count of how many paths have currently active maps and unmap only when the count reaches 0. Fixes: 485d98d472d5 ("drm/vmwgfx: Add support for CursorMob and CursorBypass 4") Cc: Broadcom internal kernel review list <bcm-kernel-feedback-list(a)broadcom.com> Cc: dri-devel(a)lists.freedesktop.org Cc: <stable(a)vger.kernel.org> # v5.19+ Signed-off-by: Zack Rusin <zack.rusin(a)broadcom.com> --- drivers/gpu/drm/vmwgfx/vmwgfx_bo.c | 13 +++++++++++-- drivers/gpu/drm/vmwgfx/vmwgfx_bo.h | 1 + 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c index f42ebc4a7c22..a0e433fbcba6 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.c @@ -360,6 +360,8 @@ void *vmw_bo_map_and_cache_size(struct vmw_bo *vbo, size_t size) void *virtual; int ret; + atomic_inc(&vbo->map_count); + virtual = ttm_kmap_obj_virtual(&vbo->map, &not_used); if (virtual) return virtual; @@ -383,11 +385,17 @@ void *vmw_bo_map_and_cache_size(struct vmw_bo *vbo, size_t size) */ void vmw_bo_unmap(struct vmw_bo *vbo) { + int map_count; + if (vbo->map.bo == NULL) return; - ttm_bo_kunmap(&vbo->map); - vbo->map.bo = NULL; + map_count = atomic_dec_return(&vbo->map_count); + + if (!map_count) { + ttm_bo_kunmap(&vbo->map); + vbo->map.bo = NULL; + } } @@ -421,6 +429,7 @@ static int vmw_bo_init(struct vmw_private *dev_priv, vmw_bo->tbo.priority = 3; vmw_bo->res_tree = RB_ROOT; xa_init(&vmw_bo->detached_resources); + atomic_set(&vmw_bo->map_count, 0); params->size = ALIGN(params->size, PAGE_SIZE); drm_gem_private_object_init(vdev, &vmw_bo->tbo.base, params->size); diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.h b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.h index 62b4342d5f7c..dc13f1e996c1 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_bo.h +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_bo.h @@ -90,6 +90,7 @@ struct vmw_bo { u32 res_prios[TTM_MAX_BO_PRIORITY]; struct xarray detached_resources; + atomic_t map_count; atomic_t cpu_writers; /* Not ref-counted. Protected by binding_mutex */ struct vmw_resource *dx_query_ctx; -- 2.43.0

10 months, 4 weeks

2
5
0 0

[PATCH v2 2/2] drm/xe/display: Make display suspend/resume work on discrete

by Maarten Lankhorst

We should unpin before evicting all memory, and repin after GT resume. This way, we preserve the contents of the framebuffers, and won't hang on resume due to migration engine not being restored yet. Signed-off-by: Maarten Lankhorst <maarten.lankhorst(a)linux.intel.com> Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: <stable(a)vger.kernel.org> # v6.8+ --- drivers/gpu/drm/xe/display/xe_display.c | 23 +++++++++++++++++++++++ drivers/gpu/drm/xe/xe_pm.c | 11 ++++++----- 2 files changed, 29 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/xe/display/xe_display.c b/drivers/gpu/drm/xe/display/xe_display.c index d544d18ad1ecc..4b9ce1f34f4c7 100644 --- a/drivers/gpu/drm/xe/display/xe_display.c +++ b/drivers/gpu/drm/xe/display/xe_display.c @@ -283,6 +283,27 @@ static bool suspend_to_idle(void) return false; } +static void xe_display_flush_cleanup_work(struct xe_device *xe) +{ + struct intel_crtc *crtc; + + for_each_intel_crtc(&xe->drm, crtc) { + struct drm_crtc_commit *commit; + + spin_lock(&crtc->base.commit_lock); + commit = list_first_entry_or_null(&crtc->base.commit_list, + struct drm_crtc_commit, commit_entry); + if (commit) + drm_crtc_commit_get(commit); + spin_unlock(&crtc->base.commit_lock); + + if (commit) { + wait_for_completion(&commit->cleanup_done); + drm_crtc_commit_put(commit); + } + } +} + void xe_display_pm_suspend(struct xe_device *xe, bool runtime) { bool s2idle = suspend_to_idle(); @@ -303,6 +324,8 @@ void xe_display_pm_suspend(struct xe_device *xe, bool runtime) if (!runtime) intel_display_driver_suspend(xe); + xe_display_flush_cleanup_work(xe); + intel_dp_mst_suspend(xe); intel_hpd_cancel_work(xe); diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c index 9f3c14fd9f337..fcfb49af8c891 100644 --- a/drivers/gpu/drm/xe/xe_pm.c +++ b/drivers/gpu/drm/xe/xe_pm.c @@ -93,13 +93,13 @@ int xe_pm_suspend(struct xe_device *xe) for_each_gt(gt, xe, id) xe_gt_suspend_prepare(gt); + xe_display_pm_suspend(xe, false); + /* FIXME: Super racey... */ err = xe_bo_evict_all(xe); if (err) goto err; - xe_display_pm_suspend(xe, false); - for_each_gt(gt, xe, id) { err = xe_gt_suspend(gt); if (err) { @@ -154,11 +154,11 @@ int xe_pm_resume(struct xe_device *xe) xe_irq_resume(xe); - xe_display_pm_resume(xe, false); - for_each_gt(gt, xe, id) xe_gt_resume(gt); + xe_display_pm_resume(xe, false); + err = xe_bo_restore_user(xe); if (err) goto err; @@ -367,10 +367,11 @@ int xe_pm_runtime_suspend(struct xe_device *xe) mutex_unlock(&xe->mem_access.vram_userfault.lock); if (xe->d3cold.allowed) { + xe_display_pm_suspend(xe, true); + err = xe_bo_evict_all(xe); if (err) goto out; - xe_display_pm_suspend(xe, true); } for_each_gt(gt, xe, id) { -- 2.45.2

10 months, 4 weeks

2
1
0 0

[PATCH v2] perf/bpf: Don't call bpf_overflow_handler() for tracing events

by Joe Damato

From: Kyle Huey <me(a)kylehuey.com> The regressing commit is new in 6.10. It assumed that anytime event->prog is set bpf_overflow_handler() should be invoked to execute the attached bpf program. This assumption is false for tracing events, and as a result the regressing commit broke bpftrace by invoking the bpf handler with garbage inputs on overflow. Prior to the regression the overflow handlers formed a chain (of length 0, 1, or 2) and perf_event_set_bpf_handler() (the !tracing case) added bpf_overflow_handler() to that chain, while perf_event_attach_bpf_prog() (the tracing case) did not. Both set event->prog. The chain of overflow handlers was replaced by a single overflow handler slot and a fixed call to bpf_overflow_handler() when appropriate. This modifies the condition there to check event->prog->type == BPF_PROG_TYPE_PERF_EVENT, restoring the previous behavior and fixing bpftrace. Signed-off-by: Kyle Huey <khuey(a)kylehuey.com> Suggested-by: Andrii Nakryiko <andrii.nakryiko(a)gmail.com> Reported-by: Joe Damato <jdamato(a)fastly.com> Closes: https://lore.kernel.org/lkml/ZpFfocvyF3KHaSzF@LQ3V64L9R2/ Fixes: f11f10bfa1ca ("perf/bpf: Call BPF handler directly, not through overflow machinery") Cc: stable(a)vger.kernel.org Tested-by: Joe Damato <jdamato(a)fastly.com> # bpftrace --- v2: - Update patch based on Andrii's suggestion - Update commit message kernel/events/core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index aa3450bdc227..c973e3c11e03 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -9706,7 +9706,8 @@ static int __perf_event_overflow(struct perf_event *event, ret = __perf_event_account_interrupt(event, throttle); - if (event->prog && !bpf_overflow_handler(event, data, regs)) + if (event->prog && event->prog->type == BPF_PROG_TYPE_PERF_EVENT && + !bpf_overflow_handler(event, data, regs)) return ret; /* -- 2.25.1

10 months, 4 weeks

3
2
0 0

FAILED: patch "[PATCH] drm/i915/gem: Fix Virtual Memory mapping boundaries" failed to apply to 4.19-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 4.19-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.19.y git checkout FETCH_HEAD git cherry-pick -x 8bdd9ef7e9b1b2a73e394712b72b22055e0e26c3 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024081222-process-suspect-d983@gregkh' --subject-prefix 'PATCH 4.19.y' HEAD^.. Possible dependencies: 8bdd9ef7e9b1 ("drm/i915/gem: Fix Virtual Memory mapping boundaries calculation") 8e4ee5e87ce6 ("drm/i915: Wrap all access to i915_vma.node.start|size") 3bb6a44251b4 ("drm/i915: Rename ggtt_view as gtt_view") d976521a995a ("drm/i915: extend i915_vma_pin_iomap()") d63ddca7c581 ("drm/i915: Update tiled blits selftest") a0ed9c95cce6 ("drm/i915/gt: Use XY_FAST_COLOR_BLT to clear obj on graphics ver 12+") fd5803e5eebe ("drm/i915/gt: use engine instance directly for offset") 892bfb8a604d ("drm/i915/fbdev: fixup setting screen_size") c674c5b9342e ("drm/i915/xehp: CCS should use RCS setup functions") 30b9d1b3ef37 ("drm/i915: add I915_BO_ALLOC_GPU_ONLY") 3312a4ac8a46 ("drm/i915/ttm: require mappable by default") 8fbf28934acf ("drm/i915/ttm: fixup the mock_bo") db927686e43f ("Merge drm/drm-next into drm-intel-gt-next") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 8bdd9ef7e9b1b2a73e394712b72b22055e0e26c3 Mon Sep 17 00:00:00 2001 From: Andi Shyti <andi.shyti(a)linux.intel.com> Date: Fri, 2 Aug 2024 10:38:50 +0200 Subject: [PATCH] drm/i915/gem: Fix Virtual Memory mapping boundaries calculation Calculating the size of the mapped area as the lesser value between the requested size and the actual size does not consider the partial mapping offset. This can cause page fault access. Fix the calculation of the starting and ending addresses, the total size is now deduced from the difference between the end and start addresses. Additionally, the calculations have been rewritten in a clearer and more understandable form. Fixes: c58305af1835 ("drm/i915: Use remap_io_mapping() to prefault all PTE in a single pass") Reported-by: Jann Horn <jannh(a)google.com> Co-developed-by: Chris Wilson <chris.p.wilson(a)linux.intel.com> Signed-off-by: Chris Wilson <chris.p.wilson(a)linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti(a)linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com> Cc: Matthew Auld <matthew.auld(a)intel.com> Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com> Cc: <stable(a)vger.kernel.org> # v4.9+ Reviewed-by: Jann Horn <jannh(a)google.com> Reviewed-by: Jonathan Cavitt <Jonathan.cavitt(a)intel.com> [Joonas: Add Requires: tag] Requires: 60a2066c5005 ("drm/i915/gem: Adjust vma offset for framebuffer mmap offset") Signed-off-by: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240802083850.103694-3-andi.… (cherry picked from commit 97b6784753da06d9d40232328efc5c5367e53417) Signed-off-by: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_mman.c b/drivers/gpu/drm/i915/gem/i915_gem_mman.c index ce10dd259812..cac6d4184506 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c @@ -290,6 +290,41 @@ static vm_fault_t vm_fault_cpu(struct vm_fault *vmf) return i915_error_to_vmf_fault(err); } +static void set_address_limits(struct vm_area_struct *area, + struct i915_vma *vma, + unsigned long obj_offset, + unsigned long *start_vaddr, + unsigned long *end_vaddr) +{ + unsigned long vm_start, vm_end, vma_size; /* user's memory parameters */ + long start, end; /* memory boundaries */ + + /* + * Let's move into the ">> PAGE_SHIFT" + * domain to be sure not to lose bits + */ + vm_start = area->vm_start >> PAGE_SHIFT; + vm_end = area->vm_end >> PAGE_SHIFT; + vma_size = vma->size >> PAGE_SHIFT; + + /* + * Calculate the memory boundaries by considering the offset + * provided by the user during memory mapping and the offset + * provided for the partial mapping. + */ + start = vm_start; + start -= obj_offset; + start += vma->gtt_view.partial.offset; + end = start + vma_size; + + start = max_t(long, start, vm_start); + end = min_t(long, end, vm_end); + + /* Let's move back into the "<< PAGE_SHIFT" domain */ + *start_vaddr = (unsigned long)start << PAGE_SHIFT; + *end_vaddr = (unsigned long)end << PAGE_SHIFT; +} + static vm_fault_t vm_fault_gtt(struct vm_fault *vmf) { #define MIN_CHUNK_PAGES (SZ_1M >> PAGE_SHIFT) @@ -302,14 +337,18 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf) struct i915_ggtt *ggtt = to_gt(i915)->ggtt; bool write = area->vm_flags & VM_WRITE; struct i915_gem_ww_ctx ww; + unsigned long obj_offset; + unsigned long start, end; /* memory boundaries */ intel_wakeref_t wakeref; struct i915_vma *vma; pgoff_t page_offset; + unsigned long pfn; int srcu; int ret; - /* We don't use vmf->pgoff since that has the fake offset */ + obj_offset = area->vm_pgoff - drm_vma_node_start(&mmo->vma_node); page_offset = (vmf->address - area->vm_start) >> PAGE_SHIFT; + page_offset += obj_offset; trace_i915_gem_object_fault(obj, page_offset, true, write); @@ -402,12 +441,14 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf) if (ret) goto err_unpin; + set_address_limits(area, vma, obj_offset, &start, &end); + + pfn = (ggtt->gmadr.start + i915_ggtt_offset(vma)) >> PAGE_SHIFT; + pfn += (start - area->vm_start) >> PAGE_SHIFT; + pfn += obj_offset - vma->gtt_view.partial.offset; + /* Finally, remap it using the new GTT offset */ - ret = remap_io_mapping(area, - area->vm_start + (vma->gtt_view.partial.offset << PAGE_SHIFT), - (ggtt->gmadr.start + i915_ggtt_offset(vma)) >> PAGE_SHIFT, - min_t(u64, vma->size, area->vm_end - area->vm_start), - &ggtt->iomap); + ret = remap_io_mapping(area, start, pfn, end - start, &ggtt->iomap); if (ret) goto err_fence;

10 months, 4 weeks

3
6
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror August 2024