July 2024 - Linux-stable-mirror

FAILED: patch "[PATCH] mm/hugetlb: fix kernel NULL pointer dereference when" failed to apply to 6.10-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y git checkout FETCH_HEAD git cherry-pick -x 1390a3334a48ecac5175865fd433d55eec255db8 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072954-overhung-citrus-9daa@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^.. Possible dependencies: 1390a3334a48 ("mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 1390a3334a48ecac5175865fd433d55eec255db8 Mon Sep 17 00:00:00 2001 From: Miaohe Lin <linmiaohe(a)huawei.com> Date: Tue, 9 Jul 2024 20:04:33 +0800 Subject: [PATCH] mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio A kernel crash was observed when migrating hugetlb folio: BUG: kernel NULL pointer dereference, address: 0000000000000008 PGD 0 P4D 0 Oops: Oops: 0002 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 3435 Comm: bash Not tainted 6.10.0-rc6-00450-g8578ca01f21f #66 RIP: 0010:__folio_undo_large_rmappable+0x70/0xb0 RSP: 0018:ffffb165c98a7b38 EFLAGS: 00000097 RAX: fffffbbc44528090 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffffa30e000a2800 RSI: 0000000000000246 RDI: ffffa3153ffffcc0 RBP: fffffbbc44528000 R08: 0000000000002371 R09: ffffffffbe4e5868 R10: 0000000000000001 R11: 0000000000000001 R12: ffffa3153ffffcc0 R13: fffffbbc44468000 R14: 0000000000000001 R15: 0000000000000001 FS: 00007f5b3a716740(0000) GS:ffffa3151fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 000000010959a000 CR4: 00000000000006f0 Call Trace: <TASK> __folio_migrate_mapping+0x59e/0x950 __migrate_folio.constprop.0+0x5f/0x120 move_to_new_folio+0xfd/0x250 migrate_pages+0x383/0xd70 soft_offline_page+0x2ab/0x7f0 soft_offline_page_store+0x52/0x90 kernfs_fop_write_iter+0x12c/0x1d0 vfs_write+0x380/0x540 ksys_write+0x64/0xe0 do_syscall_64+0xb9/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f5b3a514887 RSP: 002b:00007ffe138fce68 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007f5b3a514887 RDX: 000000000000000c RSI: 0000556ab809ee10 RDI: 0000000000000001 RBP: 0000556ab809ee10 R08: 00007f5b3a5d1460 R09: 000000007fffffff R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c R13: 00007f5b3a61b780 R14: 00007f5b3a617600 R15: 00007f5b3a616a00 It's because hugetlb folio is passed to __folio_undo_large_rmappable() unexpectedly. large_rmappable flag is imperceptibly set to hugetlb folio since commit f6a8dd98a2ce ("hugetlb: convert alloc_buddy_hugetlb_folio to use a folio"). Then commit be9581ea8c05 ("mm: fix crashes from deferred split racing folio migration") makes folio_migrate_mapping() call folio_undo_large_rmappable() triggering the bug. Fix this issue by clearing large_rmappable flag for hugetlb folios. They don't need that flag set anyway. Link: https://lkml.kernel.org/r/20240709120433.4136700-1-linmiaohe@huawei.com Fixes: f6a8dd98a2ce ("hugetlb: convert alloc_buddy_hugetlb_folio to use a folio") Fixes: be9581ea8c05 ("mm: fix crashes from deferred split racing folio migration") Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com> Cc: Hugh Dickins <hughd(a)google.com> Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org> Cc: Muchun Song <muchun.song(a)linux.dev> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 80a93b69652f..11f25e00a293 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2166,6 +2166,9 @@ static struct folio *alloc_buddy_hugetlb_folio(struct hstate *h, nid = numa_mem_id(); retry: folio = __folio_alloc(gfp_mask, order, nid, nmask); + /* Ensure hugetlb folio won't have large_rmappable flag set. */ + if (folio) + folio_clear_large_rmappable(folio); if (folio && !folio_ref_freeze(folio, 1)) { folio_put(folio);

11 months, 1 week

1
0
0 0

FAILED: patch "[PATCH] mm: fix khugepaged activation policy" failed to apply to 6.10-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y git checkout FETCH_HEAD git cherry-pick -x 00f58104202c472e487f0866fbd38832523fd4f9 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072942-compare-unworried-8aec@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^.. Possible dependencies: 00f58104202c ("mm: fix khugepaged activation policy") 7f83bf14603e ("mm/huge_memory: mark racy access onhuge_anon_orders_always") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 00f58104202c472e487f0866fbd38832523fd4f9 Mon Sep 17 00:00:00 2001 From: Ryan Roberts <ryan.roberts(a)arm.com> Date: Thu, 4 Jul 2024 10:10:50 +0100 Subject: [PATCH] mm: fix khugepaged activation policy Since the introduction of mTHP, the docuementation has stated that khugepaged would be enabled when any mTHP size is enabled, and disabled when all mTHP sizes are disabled. There are 2 problems with this; 1. this is not what was implemented by the code and 2. this is not the desirable behavior. Desirable behavior is for khugepaged to be enabled when any PMD-sized THP is enabled, anon or file. (Note that file THP is still controlled by the top-level control so we must always consider that, as well as the PMD-size mTHP control for anon). khugepaged only supports collapsing to PMD-sized THP so there is no value in enabling it when PMD-sized THP is disabled. So let's change the code and documentation to reflect this policy. Further, per-size enabled control modification events were not previously forwarded to khugepaged to give it an opportunity to start or stop. Consequently the following was resulting in khugepaged eroneously not being activated: echo never > /sys/kernel/mm/transparent_hugepage/enabled echo always > /sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled [ryan.roberts(a)arm.com: v3] Link: https://lkml.kernel.org/r/20240705102849.2479686-1-ryan.roberts@arm.com Link: https://lkml.kernel.org/r/20240705102849.2479686-1-ryan.roberts@arm.com Link: https://lkml.kernel.org/r/20240704091051.2411934-1-ryan.roberts@arm.com Signed-off-by: Ryan Roberts <ryan.roberts(a)arm.com> Fixes: 3485b88390b0 ("mm: thp: introduce multi-size THP sysfs interface") Closes: https://lore.kernel.org/linux-mm/7a0bbe69-1e3d-4263-b206-da007791a5c4@redha… Acked-by: David Hildenbrand <david(a)redhat.com> Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com> Cc: Barry Song <baohua(a)kernel.org> Cc: Jonathan Corbet <corbet(a)lwn.net> Cc: Lance Yang <ioworker0(a)gmail.com> Cc: Yang Shi <shy828301(a)gmail.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst index a1bc9b24e29a..fe237825b95c 100644 --- a/Documentation/admin-guide/mm/transhuge.rst +++ b/Documentation/admin-guide/mm/transhuge.rst @@ -202,12 +202,11 @@ PMD-mappable transparent hugepage:: cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size -khugepaged will be automatically started when one or more hugepage -sizes are enabled (either by directly setting "always" or "madvise", -or by setting "inherit" while the top-level enabled is set to "always" -or "madvise"), and it'll be automatically shutdown when the last -hugepage size is disabled (either by directly setting "never", or by -setting "inherit" while the top-level enabled is set to "never"). +khugepaged will be automatically started when PMD-sized THP is enabled +(either of the per-size anon control or the top-level control are set +to "always" or "madvise"), and it'll be automatically shutdown when +PMD-sized THP is disabled (when both the per-size anon control and the +top-level control are "never") Khugepaged controls ------------------- diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index cee3c5da8f0e..acb6ac24a07e 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -128,18 +128,6 @@ static inline bool hugepage_global_always(void) (1<<TRANSPARENT_HUGEPAGE_FLAG); } -static inline bool hugepage_flags_enabled(void) -{ - /* - * We cover both the anon and the file-backed case here; we must return - * true if globally enabled, even when all anon sizes are set to never. - * So we don't need to look at huge_anon_orders_inherit. - */ - return hugepage_global_enabled() || - READ_ONCE(huge_anon_orders_always) || - READ_ONCE(huge_anon_orders_madvise); -} - static inline int highest_order(unsigned long orders) { return fls_long(orders) - 1; diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 17fb072a0ca1..f8b5cbd4dd71 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -502,6 +502,13 @@ static ssize_t thpsize_enabled_store(struct kobject *kobj, } else ret = -EINVAL; + if (ret > 0) { + int err; + + err = start_stop_khugepaged(); + if (err) + ret = err; + } return ret; } diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 409f67a817f1..a5ec03ef8722 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -413,6 +413,26 @@ static inline int hpage_collapse_test_exit_or_disable(struct mm_struct *mm) test_bit(MMF_DISABLE_THP, &mm->flags); } +static bool hugepage_pmd_enabled(void) +{ + /* + * We cover both the anon and the file-backed case here; file-backed + * hugepages, when configured in, are determined by the global control. + * Anon pmd-sized hugepages are determined by the pmd-size control. + */ + if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && + hugepage_global_enabled()) + return true; + if (test_bit(PMD_ORDER, &huge_anon_orders_always)) + return true; + if (test_bit(PMD_ORDER, &huge_anon_orders_madvise)) + return true; + if (test_bit(PMD_ORDER, &huge_anon_orders_inherit) && + hugepage_global_enabled()) + return true; + return false; +} + void __khugepaged_enter(struct mm_struct *mm) { struct khugepaged_mm_slot *mm_slot; @@ -449,7 +469,7 @@ void khugepaged_enter_vma(struct vm_area_struct *vma, unsigned long vm_flags) { if (!test_bit(MMF_VM_HUGEPAGE, &vma->vm_mm->flags) && - hugepage_flags_enabled()) { + hugepage_pmd_enabled()) { if (thp_vma_allowable_order(vma, vm_flags, TVA_ENFORCE_SYSFS, PMD_ORDER)) __khugepaged_enter(vma->vm_mm); @@ -2462,8 +2482,7 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result, static int khugepaged_has_work(void) { - return !list_empty(&khugepaged_scan.mm_head) && - hugepage_flags_enabled(); + return !list_empty(&khugepaged_scan.mm_head) && hugepage_pmd_enabled(); } static int khugepaged_wait_event(void) @@ -2536,7 +2555,7 @@ static void khugepaged_wait_work(void) return; } - if (hugepage_flags_enabled()) + if (hugepage_pmd_enabled()) wait_event_freezable(khugepaged_wait, khugepaged_wait_event()); } @@ -2567,7 +2586,7 @@ static void set_recommended_min_free_kbytes(void) int nr_zones = 0; unsigned long recommended_min; - if (!hugepage_flags_enabled()) { + if (!hugepage_pmd_enabled()) { calculate_min_free_kbytes(); goto update_wmarks; } @@ -2617,7 +2636,7 @@ int start_stop_khugepaged(void) int err = 0; mutex_lock(&khugepaged_mutex); - if (hugepage_flags_enabled()) { + if (hugepage_pmd_enabled()) { if (!khugepaged_thread) khugepaged_thread = kthread_run(khugepaged, NULL, "khugepaged"); @@ -2643,7 +2662,7 @@ int start_stop_khugepaged(void) void khugepaged_min_free_kbytes_update(void) { mutex_lock(&khugepaged_mutex); - if (hugepage_flags_enabled() && khugepaged_thread) + if (hugepage_pmd_enabled() && khugepaged_thread) set_recommended_min_free_kbytes(); mutex_unlock(&khugepaged_mutex); }

11 months, 1 week

1
0
0 0

FAILED: patch "[PATCH] mm/huge_memory: avoid PMD-size page cache if needed" failed to apply to 6.1-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.1-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.1.y git checkout FETCH_HEAD git cherry-pick -x d659b715e94ac039803d7601505d3473393fc0be # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072952-envision-salvation-6c3a@gregkh' --subject-prefix 'PATCH 6.1.y' HEAD^.. Possible dependencies: d659b715e94a ("mm/huge_memory: avoid PMD-size page cache if needed") e0ffb29bc54d ("mm: simplify thp_vma_allowable_order") 19eaf44954df ("mm: thp: support allocation of anonymous multi-size THP") 3485b88390b0 ("mm: thp: introduce multi-size THP sysfs interface") 7a81751fcdeb ("mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"") 5003a2bdf688 ("mm: call update_mmu_cache_range() in more page fault handling paths") daa60ae64c65 ("mm,thp: fix smaps THPeligible output alignment") 3db82b9374ca ("mm/memory: allow pte_offset_map[_lock]() to fail") 3b65f437d9e8 ("mm: fix failure to unmap pte on highmem systems") 2bad466cc9d9 ("mm/uffd: UFFD_FEATURE_WP_UNPOPULATED") 3c556d2425b0 ("mm/thp: rename TRANSPARENT_HUGEPAGE_NEVER_DAX to _UNSUPPORTED") 28d41a486331 ("mm: convert wp_page_copy() to use folios") cb3184deef10 ("mm: convert do_anonymous_page() to use a folio") 6bc56a4d8553 ("mm: add vma_alloc_zeroed_movable_folio()") 2cf1338454a8 ("mm: fix khugepaged with shmem_enabled=advise") d1751118c886 ("mm/uffd: detect pgtable allocation failures") a79390f5d6a7 ("mm/mprotect: use long for page accountings and retval") 1ef488edd6c4 ("mm/mprotect: drop pgprot_t parameter from change_protection()") 931298e103c2 ("mm/userfaultfd: rely on vma->vm_page_prot in uffd_wp_range()") fed15f1345dc ("mm/hugetlb: pre-allocate pgtable pages for uffd wr-protects") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From d659b715e94ac039803d7601505d3473393fc0be Mon Sep 17 00:00:00 2001 From: Gavin Shan <gshan(a)redhat.com> Date: Mon, 15 Jul 2024 10:04:23 +1000 Subject: [PATCH] mm/huge_memory: avoid PMD-size page cache if needed xarray can't support arbitrary page cache size. the largest and supported page cache size is defined as MAX_PAGECACHE_ORDER by commit 099d90642a71 ("mm/filemap: make MAX_PAGECACHE_ORDER acceptable to xarray"). However, it's possible to have 512MB page cache in the huge memory's collapsing path on ARM64 system whose base page size is 64KB. 512MB page cache is breaking the limitation and a warning is raised when the xarray entry is split as shown in the following example. [root@dhcp-10-26-1-207 ~]# cat /proc/1/smaps | grep KernelPageSize KernelPageSize: 64 kB [root@dhcp-10-26-1-207 ~]# cat /tmp/test.c : int main(int argc, char **argv) { const char *filename = TEST_XFS_FILENAME; int fd = 0; void *buf = (void *)-1, *p; int pgsize = getpagesize(); int ret = 0; if (pgsize != 0x10000) { fprintf(stdout, "System with 64KB base page size is required!\n"); return -EPERM; } system("echo 0 > /sys/devices/virtual/bdi/253:0/read_ahead_kb"); system("echo 1 > /proc/sys/vm/drop_caches"); /* Open the xfs file */ fd = open(filename, O_RDONLY); assert(fd > 0); /* Create VMA */ buf = mmap(NULL, TEST_MEM_SIZE, PROT_READ, MAP_SHARED, fd, 0); assert(buf != (void *)-1); fprintf(stdout, "mapped buffer at 0x%p\n", buf); /* Populate VMA */ ret = madvise(buf, TEST_MEM_SIZE, MADV_NOHUGEPAGE); assert(ret == 0); ret = madvise(buf, TEST_MEM_SIZE, MADV_POPULATE_READ); assert(ret == 0); /* Collapse VMA */ ret = madvise(buf, TEST_MEM_SIZE, MADV_HUGEPAGE); assert(ret == 0); ret = madvise(buf, TEST_MEM_SIZE, MADV_COLLAPSE); if (ret) { fprintf(stdout, "Error %d to madvise(MADV_COLLAPSE)\n", errno); goto out; } /* Split xarray entry. Write permission is needed */ munmap(buf, TEST_MEM_SIZE); buf = (void *)-1; close(fd); fd = open(filename, O_RDWR); assert(fd > 0); fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE, TEST_MEM_SIZE - pgsize, pgsize); out: if (buf != (void *)-1) munmap(buf, TEST_MEM_SIZE); if (fd > 0) close(fd); return ret; } [root@dhcp-10-26-1-207 ~]# gcc /tmp/test.c -o /tmp/test [root@dhcp-10-26-1-207 ~]# /tmp/test ------------[ cut here ]------------ WARNING: CPU: 25 PID: 7560 at lib/xarray.c:1025 xas_split_alloc+0xf8/0x128 Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib \ nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct \ nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 \ ip_set rfkill nf_tables nfnetlink vfat fat virtio_balloon drm fuse \ xfs libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64 virtio_net \ sha1_ce net_failover virtio_blk virtio_console failover dimlib virtio_mmio CPU: 25 PID: 7560 Comm: test Kdump: loaded Not tainted 6.10.0-rc7-gavin+ #9 Hardware name: QEMU KVM Virtual Machine, BIOS edk2-20240524-1.el9 05/24/2024 pstate: 83400005 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--) pc : xas_split_alloc+0xf8/0x128 lr : split_huge_page_to_list_to_order+0x1c4/0x780 sp : ffff8000ac32f660 x29: ffff8000ac32f660 x28: ffff0000e0969eb0 x27: ffff8000ac32f6c0 x26: 0000000000000c40 x25: ffff0000e0969eb0 x24: 000000000000000d x23: ffff8000ac32f6c0 x22: ffffffdfc0700000 x21: 0000000000000000 x20: 0000000000000000 x19: ffffffdfc0700000 x18: 0000000000000000 x17: 0000000000000000 x16: ffffd5f3708ffc70 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: ffffffffffffffc0 x10: 0000000000000040 x9 : ffffd5f3708e692c x8 : 0000000000000003 x7 : 0000000000000000 x6 : ffff0000e0969eb8 x5 : ffffd5f37289e378 x4 : 0000000000000000 x3 : 0000000000000c40 x2 : 000000000000000d x1 : 000000000000000c x0 : 0000000000000000 Call trace: xas_split_alloc+0xf8/0x128 split_huge_page_to_list_to_order+0x1c4/0x780 truncate_inode_partial_folio+0xdc/0x160 truncate_inode_pages_range+0x1b4/0x4a8 truncate_pagecache_range+0x84/0xa0 xfs_flush_unmap_range+0x70/0x90 [xfs] xfs_file_fallocate+0xfc/0x4d8 [xfs] vfs_fallocate+0x124/0x2f0 ksys_fallocate+0x4c/0xa0 __arm64_sys_fallocate+0x24/0x38 invoke_syscall.constprop.0+0x7c/0xd8 do_el0_svc+0xb4/0xd0 el0_svc+0x44/0x1d8 el0t_64_sync_handler+0x134/0x150 el0t_64_sync+0x17c/0x180 Fix it by correcting the supported page cache orders, different sets for DAX and other files. With it corrected, 512MB page cache becomes disallowed on all non-DAX files on ARM64 system where the base page size is 64KB. After this patch is applied, the test program fails with error -EINVAL returned from __thp_vma_allowable_orders() and the madvise() system call to collapse the page caches. Link: https://lkml.kernel.org/r/20240715000423.316491-1-gshan@redhat.com Fixes: 6b24ca4a1a8d ("mm: Use multi-index entries in the page cache") Signed-off-by: Gavin Shan <gshan(a)redhat.com> Acked-by: David Hildenbrand <david(a)redhat.com> Reviewed-by: Ryan Roberts <ryan.roberts(a)arm.com> Acked-by: Zi Yan <ziy(a)nvidia.com> Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com> Cc: Barry Song <baohua(a)kernel.org> Cc: Don Dutile <ddutile(a)redhat.com> Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org> Cc: Peter Xu <peterx(a)redhat.com> Cc: Ryan Roberts <ryan.roberts(a)arm.com> Cc: William Kucharski <william.kucharski(a)oracle.com> Cc: <stable(a)vger.kernel.org> [5.17+] Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index cff002be83eb..e25d9ebfdf89 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -74,14 +74,20 @@ extern struct kobj_attribute thpsize_shmem_enabled_attr; #define THP_ORDERS_ALL_ANON ((BIT(PMD_ORDER + 1) - 1) & ~(BIT(0) | BIT(1))) /* - * Mask of all large folio orders supported for file THP. + * Mask of all large folio orders supported for file THP. Folios in a DAX + * file is never split and the MAX_PAGECACHE_ORDER limit does not apply to + * it. */ -#define THP_ORDERS_ALL_FILE (BIT(PMD_ORDER) | BIT(PUD_ORDER)) +#define THP_ORDERS_ALL_FILE_DAX \ + (BIT(PMD_ORDER) | BIT(PUD_ORDER)) +#define THP_ORDERS_ALL_FILE_DEFAULT \ + ((BIT(MAX_PAGECACHE_ORDER + 1) - 1) & ~BIT(0)) /* * Mask of all large folio orders supported for THP. */ -#define THP_ORDERS_ALL (THP_ORDERS_ALL_ANON | THP_ORDERS_ALL_FILE) +#define THP_ORDERS_ALL \ + (THP_ORDERS_ALL_ANON | THP_ORDERS_ALL_FILE_DAX | THP_ORDERS_ALL_FILE_DEFAULT) #define TVA_SMAPS (1 << 0) /* Will be used for procfs */ #define TVA_IN_PF (1 << 1) /* Page fault handler */ diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 04b72912035d..f4be468e06a4 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -89,9 +89,17 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma, bool smaps = tva_flags & TVA_SMAPS; bool in_pf = tva_flags & TVA_IN_PF; bool enforce_sysfs = tva_flags & TVA_ENFORCE_SYSFS; + unsigned long supported_orders; + /* Check the intersection of requested and supported orders. */ - orders &= vma_is_anonymous(vma) ? - THP_ORDERS_ALL_ANON : THP_ORDERS_ALL_FILE; + if (vma_is_anonymous(vma)) + supported_orders = THP_ORDERS_ALL_ANON; + else if (vma_is_dax(vma)) + supported_orders = THP_ORDERS_ALL_FILE_DAX; + else + supported_orders = THP_ORDERS_ALL_FILE_DEFAULT; + + orders &= supported_orders; if (!orders) return 0;

11 months, 1 week

1
0
0 0

FAILED: patch "[PATCH] mm/huge_memory: avoid PMD-size page cache if needed" failed to apply to 6.6-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.6-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y git checkout FETCH_HEAD git cherry-pick -x d659b715e94ac039803d7601505d3473393fc0be # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024072950-factsheet-cabbie-418a@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^.. Possible dependencies: d659b715e94a ("mm/huge_memory: avoid PMD-size page cache if needed") e0ffb29bc54d ("mm: simplify thp_vma_allowable_order") 19eaf44954df ("mm: thp: support allocation of anonymous multi-size THP") 3485b88390b0 ("mm: thp: introduce multi-size THP sysfs interface") 7a81751fcdeb ("mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From d659b715e94ac039803d7601505d3473393fc0be Mon Sep 17 00:00:00 2001 From: Gavin Shan <gshan(a)redhat.com> Date: Mon, 15 Jul 2024 10:04:23 +1000 Subject: [PATCH] mm/huge_memory: avoid PMD-size page cache if needed xarray can't support arbitrary page cache size. the largest and supported page cache size is defined as MAX_PAGECACHE_ORDER by commit 099d90642a71 ("mm/filemap: make MAX_PAGECACHE_ORDER acceptable to xarray"). However, it's possible to have 512MB page cache in the huge memory's collapsing path on ARM64 system whose base page size is 64KB. 512MB page cache is breaking the limitation and a warning is raised when the xarray entry is split as shown in the following example. [root@dhcp-10-26-1-207 ~]# cat /proc/1/smaps | grep KernelPageSize KernelPageSize: 64 kB [root@dhcp-10-26-1-207 ~]# cat /tmp/test.c : int main(int argc, char **argv) { const char *filename = TEST_XFS_FILENAME; int fd = 0; void *buf = (void *)-1, *p; int pgsize = getpagesize(); int ret = 0; if (pgsize != 0x10000) { fprintf(stdout, "System with 64KB base page size is required!\n"); return -EPERM; } system("echo 0 > /sys/devices/virtual/bdi/253:0/read_ahead_kb"); system("echo 1 > /proc/sys/vm/drop_caches"); /* Open the xfs file */ fd = open(filename, O_RDONLY); assert(fd > 0); /* Create VMA */ buf = mmap(NULL, TEST_MEM_SIZE, PROT_READ, MAP_SHARED, fd, 0); assert(buf != (void *)-1); fprintf(stdout, "mapped buffer at 0x%p\n", buf); /* Populate VMA */ ret = madvise(buf, TEST_MEM_SIZE, MADV_NOHUGEPAGE); assert(ret == 0); ret = madvise(buf, TEST_MEM_SIZE, MADV_POPULATE_READ); assert(ret == 0); /* Collapse VMA */ ret = madvise(buf, TEST_MEM_SIZE, MADV_HUGEPAGE); assert(ret == 0); ret = madvise(buf, TEST_MEM_SIZE, MADV_COLLAPSE); if (ret) { fprintf(stdout, "Error %d to madvise(MADV_COLLAPSE)\n", errno); goto out; } /* Split xarray entry. Write permission is needed */ munmap(buf, TEST_MEM_SIZE); buf = (void *)-1; close(fd); fd = open(filename, O_RDWR); assert(fd > 0); fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE, TEST_MEM_SIZE - pgsize, pgsize); out: if (buf != (void *)-1) munmap(buf, TEST_MEM_SIZE); if (fd > 0) close(fd); return ret; } [root@dhcp-10-26-1-207 ~]# gcc /tmp/test.c -o /tmp/test [root@dhcp-10-26-1-207 ~]# /tmp/test ------------[ cut here ]------------ WARNING: CPU: 25 PID: 7560 at lib/xarray.c:1025 xas_split_alloc+0xf8/0x128 Modules linked in: nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib \ nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct \ nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 \ ip_set rfkill nf_tables nfnetlink vfat fat virtio_balloon drm fuse \ xfs libcrc32c crct10dif_ce ghash_ce sha2_ce sha256_arm64 virtio_net \ sha1_ce net_failover virtio_blk virtio_console failover dimlib virtio_mmio CPU: 25 PID: 7560 Comm: test Kdump: loaded Not tainted 6.10.0-rc7-gavin+ #9 Hardware name: QEMU KVM Virtual Machine, BIOS edk2-20240524-1.el9 05/24/2024 pstate: 83400005 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--) pc : xas_split_alloc+0xf8/0x128 lr : split_huge_page_to_list_to_order+0x1c4/0x780 sp : ffff8000ac32f660 x29: ffff8000ac32f660 x28: ffff0000e0969eb0 x27: ffff8000ac32f6c0 x26: 0000000000000c40 x25: ffff0000e0969eb0 x24: 000000000000000d x23: ffff8000ac32f6c0 x22: ffffffdfc0700000 x21: 0000000000000000 x20: 0000000000000000 x19: ffffffdfc0700000 x18: 0000000000000000 x17: 0000000000000000 x16: ffffd5f3708ffc70 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: ffffffffffffffc0 x10: 0000000000000040 x9 : ffffd5f3708e692c x8 : 0000000000000003 x7 : 0000000000000000 x6 : ffff0000e0969eb8 x5 : ffffd5f37289e378 x4 : 0000000000000000 x3 : 0000000000000c40 x2 : 000000000000000d x1 : 000000000000000c x0 : 0000000000000000 Call trace: xas_split_alloc+0xf8/0x128 split_huge_page_to_list_to_order+0x1c4/0x780 truncate_inode_partial_folio+0xdc/0x160 truncate_inode_pages_range+0x1b4/0x4a8 truncate_pagecache_range+0x84/0xa0 xfs_flush_unmap_range+0x70/0x90 [xfs] xfs_file_fallocate+0xfc/0x4d8 [xfs] vfs_fallocate+0x124/0x2f0 ksys_fallocate+0x4c/0xa0 __arm64_sys_fallocate+0x24/0x38 invoke_syscall.constprop.0+0x7c/0xd8 do_el0_svc+0xb4/0xd0 el0_svc+0x44/0x1d8 el0t_64_sync_handler+0x134/0x150 el0t_64_sync+0x17c/0x180 Fix it by correcting the supported page cache orders, different sets for DAX and other files. With it corrected, 512MB page cache becomes disallowed on all non-DAX files on ARM64 system where the base page size is 64KB. After this patch is applied, the test program fails with error -EINVAL returned from __thp_vma_allowable_orders() and the madvise() system call to collapse the page caches. Link: https://lkml.kernel.org/r/20240715000423.316491-1-gshan@redhat.com Fixes: 6b24ca4a1a8d ("mm: Use multi-index entries in the page cache") Signed-off-by: Gavin Shan <gshan(a)redhat.com> Acked-by: David Hildenbrand <david(a)redhat.com> Reviewed-by: Ryan Roberts <ryan.roberts(a)arm.com> Acked-by: Zi Yan <ziy(a)nvidia.com> Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com> Cc: Barry Song <baohua(a)kernel.org> Cc: Don Dutile <ddutile(a)redhat.com> Cc: Matthew Wilcox (Oracle) <willy(a)infradead.org> Cc: Peter Xu <peterx(a)redhat.com> Cc: Ryan Roberts <ryan.roberts(a)arm.com> Cc: William Kucharski <william.kucharski(a)oracle.com> Cc: <stable(a)vger.kernel.org> [5.17+] Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index cff002be83eb..e25d9ebfdf89 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -74,14 +74,20 @@ extern struct kobj_attribute thpsize_shmem_enabled_attr; #define THP_ORDERS_ALL_ANON ((BIT(PMD_ORDER + 1) - 1) & ~(BIT(0) | BIT(1))) /* - * Mask of all large folio orders supported for file THP. + * Mask of all large folio orders supported for file THP. Folios in a DAX + * file is never split and the MAX_PAGECACHE_ORDER limit does not apply to + * it. */ -#define THP_ORDERS_ALL_FILE (BIT(PMD_ORDER) | BIT(PUD_ORDER)) +#define THP_ORDERS_ALL_FILE_DAX \ + (BIT(PMD_ORDER) | BIT(PUD_ORDER)) +#define THP_ORDERS_ALL_FILE_DEFAULT \ + ((BIT(MAX_PAGECACHE_ORDER + 1) - 1) & ~BIT(0)) /* * Mask of all large folio orders supported for THP. */ -#define THP_ORDERS_ALL (THP_ORDERS_ALL_ANON | THP_ORDERS_ALL_FILE) +#define THP_ORDERS_ALL \ + (THP_ORDERS_ALL_ANON | THP_ORDERS_ALL_FILE_DAX | THP_ORDERS_ALL_FILE_DEFAULT) #define TVA_SMAPS (1 << 0) /* Will be used for procfs */ #define TVA_IN_PF (1 << 1) /* Page fault handler */ diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 04b72912035d..f4be468e06a4 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -89,9 +89,17 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma, bool smaps = tva_flags & TVA_SMAPS; bool in_pf = tva_flags & TVA_IN_PF; bool enforce_sysfs = tva_flags & TVA_ENFORCE_SYSFS; + unsigned long supported_orders; + /* Check the intersection of requested and supported orders. */ - orders &= vma_is_anonymous(vma) ? - THP_ORDERS_ALL_ANON : THP_ORDERS_ALL_FILE; + if (vma_is_anonymous(vma)) + supported_orders = THP_ORDERS_ALL_ANON; + else if (vma_is_dax(vma)) + supported_orders = THP_ORDERS_ALL_FILE_DAX; + else + supported_orders = THP_ORDERS_ALL_FILE_DEFAULT; + + orders &= supported_orders; if (!orders) return 0;

11 months, 1 week

1
0
0 0

Linux 6.9.12

by Greg Kroah-Hartman

I'm announcing the release of the 6.9.12 kernel. All users of the 6.9 kernel series must upgrade. The updated 6.9.y git tree can be found at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-6.9.y and can be browsed at the normal kernel.org git web browser: https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary thanks, greg k-h ------------ Makefile | 2 - arch/arm64/boot/dts/qcom/ipq6018.dtsi | 1 arch/arm64/boot/dts/qcom/ipq8074.dtsi | 2 + arch/arm64/boot/dts/qcom/msm8996.dtsi | 1 arch/arm64/boot/dts/qcom/msm8998.dtsi | 1 arch/arm64/boot/dts/qcom/qrb2210-rb1.dts | 13 +++++++- arch/arm64/boot/dts/qcom/qrb4210-rb2.dts | 13 +++++++- arch/arm64/boot/dts/qcom/sc7180.dtsi | 1 arch/arm64/boot/dts/qcom/sc7280.dtsi | 1 arch/arm64/boot/dts/qcom/sdm630.dtsi | 1 arch/arm64/boot/dts/qcom/sdm845.dtsi | 2 + arch/arm64/boot/dts/qcom/sm6115.dtsi | 1 arch/arm64/boot/dts/qcom/sm6350.dtsi | 1 arch/arm64/boot/dts/qcom/x1e80100-crd.dts | 17 ++++++++--- arch/arm64/boot/dts/qcom/x1e80100-qcp.dts | 17 ++++++++--- arch/s390/mm/fault.c | 3 + drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c | 2 - drivers/net/tap.c | 5 +++ drivers/net/tun.c | 3 + drivers/usb/gadget/function/f_midi2.c | 19 +++++++----- fs/jfs/xattr.c | 23 ++++++++++++--- fs/locks.c | 9 ++--- fs/ntfs3/fslog.c | 44 ++++++++++++++++++++++++---- fs/ocfs2/dir.c | 46 ++++++++++++++++++------------ sound/core/pcm_dmaengine.c | 6 +++ sound/core/seq/seq_ump_client.c | 16 ++++++++++ sound/pci/hda/patch_realtek.c | 3 + 27 files changed, 198 insertions(+), 55 deletions(-) Abel Vesa (4): arm64: dts: qcom: x1e80100-qcp: Fix USB PHYs regulators arm64: dts: qcom: x1e80100-crd: Fix the PHY regulator for PCIe 6a arm64: dts: qcom: x1e80100-qcp: Fix the PHY regulator for PCIe 6a arm64: dts: qcom: x1e80100-crd: Fix USB PHYs regulators Dan Carpenter (1): drm/amdgpu: Fix signedness bug in sdma_v4_0_process_trap_irq() Dmitry Baryshkov (2): arm64: dts: qcom: qrb2210-rb1: switch I2C2 to i2c-gpio arm64: dts: qcom: qrb4210-rb2: switch I2C2 to i2c-gpio Dongli Zhang (1): tun: add missing verification for short frame Edson Juliano Drosdeck (1): ALSA: hda/realtek: Enable headset mic on Positivo SU C1400 Gerald Schaefer (1): s390/mm: Fix VM_FAULT_HWPOISON handling in do_exception() Greg Kroah-Hartman (1): Linux 6.9.12 Jann Horn (1): filelock: Fix fcntl/close race recovery compat path Konstantin Komarov (1): fs/ntfs3: Add a check for attr_names and oatbl Krishna Kurapati (10): arm64: dts: qcom: sc7180: Disable SuperSpeed instances in park mode arm64: dts: qcom: sc7280: Disable SuperSpeed instances in park mode arm64: dts: qcom: msm8996: Disable SS instance in Parkmode for USB arm64: dts: qcom: sm6350: Disable SS instance in Parkmode for USB arm64: dts: qcom: msm8998: Disable SS instance in Parkmode for USB arm64: dts: qcom: ipq6018: Disable SS instance in Parkmode for USB arm64: dts: qcom: sdm630: Disable SS instance in Parkmode for USB arm64: dts: qcom: ipq8074: Disable SS instance in Parkmode for USB arm64: dts: qcom: sdm845: Disable SS instance in Parkmode for USB arm64: dts: qcom: sm6115: Disable SS instance in Parkmode for USB Seunghun Han (1): ALSA: hda/realtek: Fix the speaker output on Samsung Galaxy Book Pro 360 Shenghao Ding (1): ALSA: hda/tas2781: Add new quirk for Lenovo Hera2 Laptop Shengjiu Wang (1): ALSA: pcm_dmaengine: Don't synchronize DMA channel when DMA is paused Si-Wei Liu (1): tap: add missing verification for short frame Takashi Iwai (2): usb: gadget: midi2: Fix incorrect default MIDI2 protocol setup ALSA: seq: ump: Skip useless ports for static blocks lei lu (3): ocfs2: add bounds checking to ocfs2_check_dir_entry() jfs: don't walk off the end of ealist fs/ntfs3: Validate ff offset

11 months, 1 week

1
2
0 0

[PATCH v1 2/2] mm/hugetlb: fix hugetlb vs. core-mm PT locking

by David Hildenbrand

We recently made GUP's common page table walking code to also walk hugetlb VMAs without most hugetlb special-casing, preparing for the future of having less hugetlb-specific page table walking code in the codebase. Turns out that we missed one page table locking detail: page table locking for hugetlb folios that are not mapped using a single PMD/PUD. Assume we have hugetlb folio that spans multiple PTEs (e.g., 64 KiB hugetlb folios on arm64 with 4 KiB base page size). GUP, as it walks the page tables, will perform a pte_offset_map_lock() to grab the PTE table lock. However, hugetlb that concurrently modifies these page tables would actually grab the mm->page_table_lock: with USE_SPLIT_PTE_PTLOCKS, the locks would differ. Something similar can happen right now with hugetlb folios that span multiple PMDs when USE_SPLIT_PMD_PTLOCKS. Let's make huge_pte_lockptr() effectively uses the same PT locks as any core-mm page table walker would. There is one ugly case: powerpc 8xx, whereby we have an 8 MiB hugetlb folio being mapped using two PTE page tables. While hugetlb wants to take the PMD table lock, core-mm would grab the PTE table lock of one of both PTE page tables. In such corner cases, we have to make sure that both locks match, which is (fortunately!) currently guaranteed for 8xx as it does not support SMP. Fixes: 9cb28da54643 ("mm/gup: handle hugetlb in the generic follow_page_mask code") Cc: <stable(a)vger.kernel.org> Signed-off-by: David Hildenbrand <david(a)redhat.com> --- include/linux/hugetlb.h | 25 ++++++++++++++++++++++--- 1 file changed, 22 insertions(+), 3 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index c9bf68c239a01..da800e56fe590 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -944,10 +944,29 @@ static inline bool htlb_allow_alloc_fallback(int reason) static inline spinlock_t *huge_pte_lockptr(struct hstate *h, struct mm_struct *mm, pte_t *pte) { - if (huge_page_size(h) == PMD_SIZE) + VM_WARN_ON(huge_page_size(h) == PAGE_SIZE); + VM_WARN_ON(huge_page_size(h) >= P4D_SIZE); + + /* + * hugetlb must use the exact same PT locks as core-mm page table + * walkers would. When modifying a PTE table, hugetlb must take the + * PTE PT lock, when modifying a PMD table, hugetlb must take the PMD + * PT lock etc. + * + * The expectation is that any hugetlb folio smaller than a PMD is + * always mapped into a single PTE table and that any hugetlb folio + * smaller than a PUD (but at least as big as a PMD) is always mapped + * into a single PMD table. + * + * If that does not hold for an architecture, then that architecture + * must disable split PT locks such that all *_lockptr() functions + * will give us the same result: the per-MM PT lock. + */ + if (huge_page_size(h) < PMD_SIZE) + return pte_lockptr(mm, pte); + else if (huge_page_size(h) < PUD_SIZE) return pmd_lockptr(mm, (pmd_t *) pte); - VM_BUG_ON(huge_page_size(h) == PAGE_SIZE); - return &mm->page_table_lock; + return pud_lockptr(mm, (pud_t *) pte); } #ifndef hugepages_supported -- 2.45.2

11 months, 1 week

5
11
0 0

[PATCH v3] arm64: dts: qcom: sa8775p: Mark APPS and PCIe SMMUs as DMA coherent

by Qingqing Zhou

The SMMUs on sa8775p are cache-coherent. GPU SMMU is marked as such, mark the APPS and PCIe ones as well. Fixes: 603f96d4c9d0 ("arm64: dts: qcom: add initial support for qcom sa8775p-ride") Fixes: 2dba7a613a6e ("arm64: dts: qcom: sa8775p: add the pcie smmu node") Cc: stable(a)vger.kernel.org Reviewed-by: Konrad Dybcio <konrad.dybcio(a)linaro.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org> Signed-off-by: Qingqing Zhou <quic_qqzhou(a)quicinc.com> --- v2 -> v3: - Remove the line break between tags. - Add the Cc stable tag. Link to v2: https://lore.kernel.org/all/20240723075948.9545-1-quic_qqzhou@quicinc.com/ v1 -> v2: - Add the Fixes tags. - Update the commit message. Link to v1: https://lore.kernel.org/lkml/20240715071649.25738-1-quic_qqzhou@quicinc.com/ --- arch/arm64/boot/dts/qcom/sa8775p.dtsi | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/boot/dts/qcom/sa8775p.dtsi b/arch/arm64/boot/dts/qcom/sa8775p.dtsi index 23f1b2e5e624..95691ab58a23 100644 --- a/arch/arm64/boot/dts/qcom/sa8775p.dtsi +++ b/arch/arm64/boot/dts/qcom/sa8775p.dtsi @@ -3070,6 +3070,7 @@ reg = <0x0 0x15000000 0x0 0x100000>; #iommu-cells = <2>; #global-interrupts = <2>; + dma-coherent; interrupts = <GIC_SPI 119 IRQ_TYPE_LEVEL_HIGH>, <GIC_SPI 120 IRQ_TYPE_LEVEL_HIGH>, @@ -3208,6 +3209,7 @@ reg = <0x0 0x15200000 0x0 0x80000>; #iommu-cells = <2>; #global-interrupts = <2>; + dma-coherent; interrupts = <GIC_SPI 920 IRQ_TYPE_LEVEL_HIGH>, <GIC_SPI 921 IRQ_TYPE_LEVEL_HIGH>, -- 2.17.1

11 months, 1 week

2
1
0 0

[PATCH v2] soc: qcom: cmd-db: Map shared memory as WC, not WB

by Maulik Shah

From: Volodymyr Babchuk <Volodymyr_Babchuk(a)epam.com> Linux does not write into cmd-db region. This region of memory is write protected by XPU. XPU may sometime falsely detect clean cache eviction as "write" into the write protected region leading to secure interrupt which causes an endless loop somewhere in Trust Zone. The only reason it is working right now is because Qualcomm Hypervisor maps the same region as Non-Cacheable memory in Stage 2 translation tables. The issue manifests if we want to use another hypervisor (like Xen or KVM), which does not know anything about those specific mappings. Changing the mapping of cmd-db memory from MEMREMAP_WB to MEMREMAP_WT/WC removes dependency on correct mappings in Stage 2 tables. This patch fixes the issue by updating the mapping to MEMREMAP_WC. I tested this on SA8155P with Xen. Fixes: 312416d9171a ("drivers: qcom: add command DB driver") Cc: stable(a)vger.kernel.org # 5.4+ Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk(a)epam.com> Tested-by: Nikita Travkin <nikita(a)trvn.ru> # sc7180 WoA in EL2 Signed-off-by: Maulik Shah <quic_mkshah(a)quicinc.com> --- Changes in v2: - Use MEMREMAP_WC instead of MEMREMAP_WT - Update commit message from comments in v1 - Add Fixes tag and Cc to stable - Link to v1: https://lore.kernel.org/lkml/20240327200917.2576034-1-volodymyr_babchuk@epa… --- drivers/soc/qcom/cmd-db.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/soc/qcom/cmd-db.c b/drivers/soc/qcom/cmd-db.c index d84572662017..ae66c2623d25 100644 --- a/drivers/soc/qcom/cmd-db.c +++ b/drivers/soc/qcom/cmd-db.c @@ -349,7 +349,7 @@ static int cmd_db_dev_probe(struct platform_device *pdev) return -EINVAL; } - cmd_db_header = memremap(rmem->base, rmem->size, MEMREMAP_WB); + cmd_db_header = memremap(rmem->base, rmem->size, MEMREMAP_WC); if (!cmd_db_header) { ret = -ENOMEM; cmd_db_header = NULL; --- base-commit: 797012914d2d031430268fe512af0ccd7d8e46ef change-id: 20240718-cmd_db_uncached-e896da5c5296 Best regards, -- Maulik Shah <quic_mkshah(a)quicinc.com>

11 months, 1 week

4
7
0 0

[PATCH] clk: qcom: gcc-sm8450: Do not turn off PCIe GDSCs during gdsc_disable()

by Manivannan Sadhasivam

With PWRSTS_OFF_ON, PCIe GDSCs are turned off during gdsc_disable(). This can happen during scenarios such as system suspend and breaks the resume of PCIe controllers from suspend. So use PWRSTS_RET_ON to indicate the GDSC driver to not turn off the GDSCs during gdsc_disable() and allow the hardware to transition the GDSCs to retention when the parent domain enters low power state during system suspend. Cc: stable(a)vger.kernel.org # 5.17 Fixes: db0c944ee92b ("clk: qcom: Add clock driver for SM8450") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org> --- drivers/clk/qcom/gcc-sm8450.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/clk/qcom/gcc-sm8450.c b/drivers/clk/qcom/gcc-sm8450.c index 639a9a955914..c445c271678a 100644 --- a/drivers/clk/qcom/gcc-sm8450.c +++ b/drivers/clk/qcom/gcc-sm8450.c @@ -2974,7 +2974,7 @@ static struct gdsc pcie_0_gdsc = { .pd = { .name = "pcie_0_gdsc", }, - .pwrsts = PWRSTS_OFF_ON, + .pwrsts = PWRSTS_RET_ON, }; static struct gdsc pcie_1_gdsc = { @@ -2982,7 +2982,7 @@ static struct gdsc pcie_1_gdsc = { .pd = { .name = "pcie_1_gdsc", }, - .pwrsts = PWRSTS_OFF_ON, + .pwrsts = PWRSTS_RET_ON, }; static struct gdsc ufs_phy_gdsc = { -- 2.25.1

11 months, 1 week

2
1
0 0

[PATCH] clk: qcom: gcc-sm8250: Do not turn off PCIe GDSCs during gdsc_disable()

by Manivannan Sadhasivam

With PWRSTS_OFF_ON, PCIe GDSCs are turned off during gdsc_disable(). This can happen during scenarios such as system suspend and breaks the resume of PCIe controllers from suspend. So use PWRSTS_RET_ON to indicate the GDSC driver to not turn off the GDSCs during gdsc_disable() and allow the hardware to transition the GDSCs to retention when the parent domain enters low power state during system suspend. Cc: stable(a)vger.kernel.org # 5.7 Fixes: 3e5770921a88 ("clk: qcom: gcc: Add global clock controller driver for SM8250") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org> --- drivers/clk/qcom/gcc-sm8250.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/clk/qcom/gcc-sm8250.c b/drivers/clk/qcom/gcc-sm8250.c index 991cd8b8d597..1c59d70e0f96 100644 --- a/drivers/clk/qcom/gcc-sm8250.c +++ b/drivers/clk/qcom/gcc-sm8250.c @@ -3226,7 +3226,7 @@ static struct gdsc pcie_0_gdsc = { .pd = { .name = "pcie_0_gdsc", }, - .pwrsts = PWRSTS_OFF_ON, + .pwrsts = PWRSTS_RET_ON, }; static struct gdsc pcie_1_gdsc = { @@ -3234,7 +3234,7 @@ static struct gdsc pcie_1_gdsc = { .pd = { .name = "pcie_1_gdsc", }, - .pwrsts = PWRSTS_OFF_ON, + .pwrsts = PWRSTS_RET_ON, }; static struct gdsc pcie_2_gdsc = { @@ -3242,7 +3242,7 @@ static struct gdsc pcie_2_gdsc = { .pd = { .name = "pcie_2_gdsc", }, - .pwrsts = PWRSTS_OFF_ON, + .pwrsts = PWRSTS_RET_ON, }; static struct gdsc ufs_card_gdsc = { -- 2.25.1

11 months, 1 week

2
1
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror July 2024