July 2024 - Linux-stable-mirror

[PATCH 1/3] arm64: dts: mediatek: mt8195-cherry: Mark USB 3.0 on xhci1 as disabled

by Chen-Yu Tsai

USB 3.0 on xhci1 is not used, as the controller shares the same PHY as pcie1. The latter is enabled to support the M.2 PCIe WLAN card on this design. Mark USB 3.0 as disabled on this controller using the "mediatek,u3p-dis-msk" property. Reported-by: Nícolas F. R. A. Prado <nfraprado(a)collabora.com> #KernelCI Closes: https://lore.kernel.org/all/9fce9838-ef87-4d1b-b3df-63e1ddb0ec51@notapiano/ Fixes: b6267a396e1c ("arm64: dts: mediatek: cherry: Enable T-PHYs and USB XHCI controllers") Cc: <stable(a)vger.kernel.org> Signed-off-by: Chen-Yu Tsai <wenst(a)chromium.org> --- arch/arm64/boot/dts/mediatek/mt8195-cherry.dtsi | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/boot/dts/mediatek/mt8195-cherry.dtsi b/arch/arm64/boot/dts/mediatek/mt8195-cherry.dtsi index fe5400e17b0f..d3a52acbe48a 100644 --- a/arch/arm64/boot/dts/mediatek/mt8195-cherry.dtsi +++ b/arch/arm64/boot/dts/mediatek/mt8195-cherry.dtsi @@ -1404,6 +1404,7 @@ &xhci1 { rx-fifo-depth = <3072>; vusb33-supply = <&mt6359_vusb_ldo_reg>; vbus-supply = <&usb_vbus>; + mediatek,u3p-dis-msk = <1>; }; &xhci2 { -- 2.46.0.rc1.232.g9752f9e123-goog

11 months, 2 weeks

1
0
0 0

+ kcov-properly-check-for-softirq-context.patch added to mm-hotfixes-unstable branch

by Andrew Morton

The patch titled Subject: kcov: properly check for softirq context has been added to the -mm mm-hotfixes-unstable branch. Its filename is kcov-properly-check-for-softirq-context.patch This patch will shortly appear at https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche… This patch will later appear in the mm-hotfixes-unstable branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Before you just go and hit "reply", please: a) Consider who else should be cc'ed b) Prefer to cc a suitable mailing list as well c) Ideally: find the original patch on the mailing list and do a reply-to-all to that, adding suitable additional cc's *** Remember to use Documentation/process/submit-checklist.rst when testing your code *** The -mm tree is included into linux-next via the mm-everything branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm and is updated there every 2-3 working days ------------------------------------------------------ From: Andrey Konovalov <andreyknvl(a)gmail.com> Subject: kcov: properly check for softirq context Date: Mon, 29 Jul 2024 04:21:58 +0200 When collecting coverage from softirqs, KCOV uses in_serving_softirq() to check whether the code is running in the softirq context. Unfortunately, in_serving_softirq() is > 0 even when the code is running in the hardirq or NMI context for hardirqs and NMIs that happened during a softirq. As a result, if a softirq handler contains a remote coverage collection section and a hardirq with another remote coverage collection section happens during handling the softirq, KCOV incorrectly detects a nested softirq coverate collection section and prints a WARNING, as reported by syzbot. This issue was exposed by commit a7f3813e589f ("usb: gadget: dummy_hcd: Switch to hrtimer transfer scheduler"), which switched dummy_hcd to using hrtimer and made the timer's callback be executed in the hardirq context. Change the related checks in KCOV to account for this behavior of in_serving_softirq() and make KCOV ignore remote coverage collection sections in the hardirq and NMI contexts. This prevents the WARNING printed by syzbot but does not fix the inability of KCOV to collect coverage from the __usb_hcd_giveback_urb when dummy_hcd is in use (caused by a7f3813e589f); a separate patch is required for that. Link: https://lkml.kernel.org/r/20240729022158.92059-1-andrey.konovalov@linux.dev Fixes: 5ff3b30ab57d ("kcov: collect coverage from interrupts") Signed-off-by: Andrey Konovalov <andreyknvl(a)gmail.com> Reported-by: syzbot+2388cdaeb6b10f0c13ac(a)syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=2388cdaeb6b10f0c13ac Acked-by: Marco Elver <elver(a)google.com> Cc: Alan Stern <stern(a)rowland.harvard.edu> Cc: Aleksandr Nogikh <nogikh(a)google.com> Cc: Alexander Potapenko <glider(a)google.com> Cc: Dmitry Vyukov <dvyukov(a)google.com> Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Cc: Marcello Sylvester Bauer <sylv(a)sylv.io> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- kernel/kcov.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) --- a/kernel/kcov.c~kcov-properly-check-for-softirq-context +++ a/kernel/kcov.c @@ -161,6 +161,15 @@ static void kcov_remote_area_put(struct kmsan_unpoison_memory(&area->list, sizeof(area->list)); } +/* + * Unlike in_serving_softirq(), this function returns false when called during + * a hardirq or an NMI that happened in the softirq context. + */ +static inline bool in_softirq_really(void) +{ + return in_serving_softirq() && !in_hardirq() && !in_nmi(); +} + static notrace bool check_kcov_mode(enum kcov_mode needed_mode, struct task_struct *t) { unsigned int mode; @@ -170,7 +179,7 @@ static notrace bool check_kcov_mode(enum * so we ignore code executed in interrupts, unless we are in a remote * coverage collection section in a softirq. */ - if (!in_task() && !(in_serving_softirq() && t->kcov_softirq)) + if (!in_task() && !(in_softirq_really() && t->kcov_softirq)) return false; mode = READ_ONCE(t->kcov_mode); /* @@ -849,7 +858,7 @@ void kcov_remote_start(u64 handle) if (WARN_ON(!kcov_check_handle(handle, true, true, true))) return; - if (!in_task() && !in_serving_softirq()) + if (!in_task() && !in_softirq_really()) return; local_lock_irqsave(&kcov_percpu_data.lock, flags); @@ -991,7 +1000,7 @@ void kcov_remote_stop(void) int sequence; unsigned long flags; - if (!in_task() && !in_serving_softirq()) + if (!in_task() && !in_softirq_really()) return; local_lock_irqsave(&kcov_percpu_data.lock, flags); _ Patches currently in -mm which might be from andreyknvl(a)gmail.com are kcov-properly-check-for-softirq-context.patch kcov-dont-instrument-lib-find_bitc.patch

11 months, 2 weeks

1
0
0 0

[PATCH v2] mm/hugetlb: fix hugetlb vs. core-mm PT locking

by David Hildenbrand

We recently made GUP's common page table walking code to also walk hugetlb VMAs without most hugetlb special-casing, preparing for the future of having less hugetlb-specific page table walking code in the codebase. Turns out that we missed one page table locking detail: page table locking for hugetlb folios that are not mapped using a single PMD/PUD. Assume we have hugetlb folio that spans multiple PTEs (e.g., 64 KiB hugetlb folios on arm64 with 4 KiB base page size). GUP, as it walks the page tables, will perform a pte_offset_map_lock() to grab the PTE table lock. However, hugetlb that concurrently modifies these page tables would actually grab the mm->page_table_lock: with USE_SPLIT_PTE_PTLOCKS, the locks would differ. Something similar can happen right now with hugetlb folios that span multiple PMDs when USE_SPLIT_PMD_PTLOCKS. This issue can be reproduced [1], for example triggering: [ 3105.936100] ------------[ cut here ]------------ [ 3105.939323] WARNING: CPU: 31 PID: 2732 at mm/gup.c:142 try_grab_folio+0x11c/0x188 [ 3105.944634] Modules linked in: [...] [ 3105.974841] CPU: 31 PID: 2732 Comm: reproducer Not tainted 6.10.0-64.eln141.aarch64 #1 [ 3105.980406] Hardware name: QEMU KVM Virtual Machine, BIOS edk2-20240524-4.fc40 05/24/2024 [ 3105.986185] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 3105.991108] pc : try_grab_folio+0x11c/0x188 [ 3105.994013] lr : follow_page_pte+0xd8/0x430 [ 3105.996986] sp : ffff80008eafb8f0 [ 3105.999346] x29: ffff80008eafb900 x28: ffffffe8d481f380 x27: 00f80001207cff43 [ 3106.004414] x26: 0000000000000001 x25: 0000000000000000 x24: ffff80008eafba48 [ 3106.009520] x23: 0000ffff9372f000 x22: ffff7a54459e2000 x21: ffff7a546c1aa978 [ 3106.014529] x20: ffffffe8d481f3c0 x19: 0000000000610041 x18: 0000000000000001 [ 3106.019506] x17: 0000000000000001 x16: ffffffffffffffff x15: 0000000000000000 [ 3106.024494] x14: ffffb85477fdfe08 x13: 0000ffff9372ffff x12: 0000000000000000 [ 3106.029469] x11: 1fffef4a88a96be1 x10: ffff7a54454b5f0c x9 : ffffb854771b12f0 [ 3106.034324] x8 : 0008000000000000 x7 : ffff7a546c1aa980 x6 : 0008000000000080 [ 3106.038902] x5 : 00000000001207cf x4 : 0000ffff9372f000 x3 : ffffffe8d481f000 [ 3106.043420] x2 : 0000000000610041 x1 : 0000000000000001 x0 : 0000000000000000 [ 3106.047957] Call trace: [ 3106.049522] try_grab_folio+0x11c/0x188 [ 3106.051996] follow_pmd_mask.constprop.0.isra.0+0x150/0x2e0 [ 3106.055527] follow_page_mask+0x1a0/0x2b8 [ 3106.058118] __get_user_pages+0xf0/0x348 [ 3106.060647] faultin_page_range+0xb0/0x360 [ 3106.063651] do_madvise+0x340/0x598 Let's make huge_pte_lockptr() effectively uses the same PT locks as any core-mm page table walker would. Add ptep_lockptr() to obtain the PTE page table lock using a pte pointer -- unfortunately we cannot convert pte_lockptr() because virt_to_page() doesn't work with kmap'ed page tables we can have with CONFIG_HIGHPTE. There is one ugly case: powerpc 8xx, whereby we have an 8 MiB hugetlb folio being mapped using two PTE page tables. While hugetlb wants to take the PMD table lock, core-mm would grab the PTE table lock of one of both PTE page tables. In such corner cases, we have to make sure that both locks match, which is (fortunately!) currently guaranteed for 8xx as it does not support SMP and consequently doesn't use split PT locks. [1] https://lore.kernel.org/all/1bbfcc7f-f222-45a5-ac44-c5a1381c596d@redhat.com/ Fixes: 9cb28da54643 ("mm/gup: handle hugetlb in the generic follow_page_mask code") Cc: <stable(a)vger.kernel.org> Cc: Peter Xu <peterx(a)redhat.com> Cc: Oscar Salvador <osalvador(a)suse.de> Cc: Muchun Song <muchun.song(a)linux.dev> Cc: Baolin Wang <baolin.wang(a)linux.alibaba.com> Signed-off-by: David Hildenbrand <david(a)redhat.com> --- Still busy runtime-testing of this version -- have to set up my ARM environment again. Dropped the RB's/ACKs because there was significant change in the pte_lockptr() handling. v1 -> 2: * Extend patch description * Drop "mm: let pte_lockptr() consume a pte_t pointer" * Introduce ptep_lockptr() in this patch I wish there was a nicer way to avoid messing with CONFIG_HIGHPTE ... --- include/linux/hugetlb.h | 26 +++++++++++++++++++++++--- include/linux/mm.h | 10 ++++++++++ 2 files changed, 33 insertions(+), 3 deletions(-) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index c9bf68c239a01..dd6d4ee5ee59c 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -944,10 +944,30 @@ static inline bool htlb_allow_alloc_fallback(int reason) static inline spinlock_t *huge_pte_lockptr(struct hstate *h, struct mm_struct *mm, pte_t *pte) { - if (huge_page_size(h) == PMD_SIZE) + VM_WARN_ON(huge_page_size(h) == PAGE_SIZE); + VM_WARN_ON(huge_page_size(h) >= P4D_SIZE); + + /* + * hugetlb must use the exact same PT locks as core-mm page table + * walkers would. When modifying a PTE table, hugetlb must take the + * PTE PT lock, when modifying a PMD table, hugetlb must take the PMD + * PT lock etc. + * + * The expectation is that any hugetlb folio smaller than a PMD is + * always mapped into a single PTE table and that any hugetlb folio + * smaller than a PUD (but at least as big as a PMD) is always mapped + * into a single PMD table. + * + * If that does not hold for an architecture, then that architecture + * must disable split PT locks such that all *_lockptr() functions + * will give us the same result: the per-MM PT lock. + */ + if (huge_page_size(h) < PMD_SIZE && !IS_ENABLED(CONFIG_HIGHPTE)) + /* pte_alloc_huge() only applies with !CONFIG_HIGHPTE */ + return ptep_lockptr(mm, pte); + else if (huge_page_size(h) < PUD_SIZE) return pmd_lockptr(mm, (pmd_t *) pte); - VM_BUG_ON(huge_page_size(h) == PAGE_SIZE); - return &mm->page_table_lock; + return pud_lockptr(mm, (pud_t *) pte); } #ifndef hugepages_supported diff --git a/include/linux/mm.h b/include/linux/mm.h index b100df8cb5857..1b1f40ff00b7d 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2926,6 +2926,12 @@ static inline spinlock_t *pte_lockptr(struct mm_struct *mm, pmd_t *pmd) return ptlock_ptr(page_ptdesc(pmd_page(*pmd))); } +static inline spinlock_t *ptep_lockptr(struct mm_struct *mm, pte_t *pte) +{ + BUILD_BUG_ON(IS_ENABLED(CONFIG_HIGHPTE)); + return ptlock_ptr(virt_to_ptdesc(pte)); +} + static inline bool ptlock_init(struct ptdesc *ptdesc) { /* @@ -2950,6 +2956,10 @@ static inline spinlock_t *pte_lockptr(struct mm_struct *mm, pmd_t *pmd) { return &mm->page_table_lock; } +static inline spinlock_t *ptep_lockptr(struct mm_struct *mm, pte_t *pte) +{ + return &mm->page_table_lock; +} static inline void ptlock_cache_init(void) {} static inline bool ptlock_init(struct ptdesc *ptdesc) { return true; } static inline void ptlock_free(struct ptdesc *ptdesc) {} -- 2.45.2

11 months, 2 weeks

3
6
0 0

[PATCH net V1] net: phy: micrel: Fix the KSZ9131 MDI-X status issue

by Raju Lakkaraju

The MDIX status is not accurately reflecting the current state after the link partner has manually altered its MDIX configuration while operating in forced mode. Access information about Auto mdix completion and pair selection from the KSZ9131's Auto/MDI/MDI-X status register Fixes: b64e6a8794d9 ("net: phy: micrel: Add PHY Auto/MDI/MDI-X set driver for KSZ9131") Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju(a)microchip.com> --- drivers/net/phy/micrel.c | 34 +++++++++++++++++++--------------- 1 file changed, 19 insertions(+), 15 deletions(-) diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c index dd519805deee..65b0a3115e14 100644 --- a/drivers/net/phy/micrel.c +++ b/drivers/net/phy/micrel.c @@ -1389,6 +1389,8 @@ static int ksz9131_config_init(struct phy_device *phydev) const struct device *dev_walker; int ret; + phydev->mdix_ctrl = ETH_TP_MDI_AUTO; + dev_walker = &phydev->mdio.dev; do { of_node = dev_walker->of_node; @@ -1438,28 +1440,30 @@ static int ksz9131_config_init(struct phy_device *phydev) #define MII_KSZ9131_AUTO_MDIX 0x1C #define MII_KSZ9131_AUTO_MDI_SET BIT(7) #define MII_KSZ9131_AUTO_MDIX_SWAP_OFF BIT(6) +#define MII_KSZ9131_DIG_AXAN_STS 0x14 +#define MII_KSZ9131_DIG_AXAN_STS_LINK_DET BIT(14) +#define MII_KSZ9131_DIG_AXAN_STS_A_SELECT BIT(12) static int ksz9131_mdix_update(struct phy_device *phydev) { int ret; - ret = phy_read(phydev, MII_KSZ9131_AUTO_MDIX); - if (ret < 0) - return ret; - - if (ret & MII_KSZ9131_AUTO_MDIX_SWAP_OFF) { - if (ret & MII_KSZ9131_AUTO_MDI_SET) - phydev->mdix_ctrl = ETH_TP_MDI; - else - phydev->mdix_ctrl = ETH_TP_MDI_X; + if (phydev->mdix_ctrl != ETH_TP_MDI_AUTO) { + phydev->mdix = phydev->mdix_ctrl; } else { - phydev->mdix_ctrl = ETH_TP_MDI_AUTO; - } + ret = phy_read(phydev, MII_KSZ9131_DIG_AXAN_STS); + if (ret < 0) + return ret; - if (ret & MII_KSZ9131_AUTO_MDI_SET) - phydev->mdix = ETH_TP_MDI; - else - phydev->mdix = ETH_TP_MDI_X; + if (ret & MII_KSZ9131_DIG_AXAN_STS_LINK_DET) { + if (ret & MII_KSZ9131_DIG_AXAN_STS_A_SELECT) + phydev->mdix = ETH_TP_MDI; + else + phydev->mdix = ETH_TP_MDI_X; + } else { + phydev->mdix = ETH_TP_MDI_INVALID; + } + } return 0; } -- 2.34.1

11 months, 2 weeks

5
4
0 0

[to-be-updated] mm-let-pte_lockptr-consume-a-pte_t-pointer.patch removed from -mm tree

by Andrew Morton

The quilt patch titled Subject: mm: let pte_lockptr() consume a pte_t pointer has been removed from the -mm tree. Its filename was mm-let-pte_lockptr-consume-a-pte_t-pointer.patch This patch was dropped because an updated version will be issued ------------------------------------------------------ From: David Hildenbrand <david(a)redhat.com> Subject: mm: let pte_lockptr() consume a pte_t pointer Date: Thu, 25 Jul 2024 20:39:54 +0200 Patch series "mm/hugetlb: fix hugetlb vs. core-mm PT locking". Working on another generic page table walker that tries to avoid special-casing hugetlb, I found a page table locking issue with hugetlb folios that are not mapped using a single PMD/PUD. For some hugetlb folio sizes, GUP will take different page table locks when walking the page tables than hugetlb when modifying the page tables. I did not actually try reproducing an issue, but looking at follow_pmd_mask() where we might be rereading a PMD value multiple times it's rather clear that concurrent modifications are rather unpleasant. In follow_page_pte() we might be better in that regard -- ptep_get() does a READ_ONCE() -- but who knows what else could happen concurrently in some weird corner cases (e.g., hugetlb folio getting unmapped and freed). This patch (of 2): pte_lockptr() is the only *_lockptr() function that doesn't consume what would be expected: it consumes a pmd_t pointer instead of a pte_t pointer. Let's change that. The two callers in pgtable-generic.c are easily adjusted. Adjust khugepaged.c:retract_page_tables() to simply do a pte_offset_map_nolock() to obtain the lock, even though we won't actually be traversing the page table. This makes the code more similar to the other variants and avoids other hacks to make the new pte_lockptr() version happy. pte_lockptr() users reside now only in pgtable-generic.c. Maybe, using pte_offset_map_nolock() is the right thing to do because the PTE table could have been removed in the meantime? At least it sounds more future proof if we ever have other means of page table reclaim. It's not quite clear if holding the PTE table lock is really required: what if someone else obtains the lock just after we unlock it? But we'll leave that as is for now, maybe there are good reasons. This is a preparation for adapting hugetlb page table locking logic to take the same locks as core-mm page table walkers would. Link: https://lkml.kernel.org/r/20240725183955.2268884-1-david@redhat.com Link: https://lkml.kernel.org/r/20240725183955.2268884-2-david@redhat.com Fixes: 9cb28da54643 ("mm/gup: handle hugetlb in the generic follow_page_mask code") Signed-off-by: David Hildenbrand <david(a)redhat.com> Cc: Muchun Song <muchun.song(a)linux.dev> Cc: Oscar Salvador <osalvador(a)suse.de> Cc: Peter Xu <peterx(a)redhat.com> Cc: <stable(a)vger.kernel.org> Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org> --- include/linux/mm.h | 7 ++++--- mm/khugepaged.c | 21 +++++++++++++++------ mm/pgtable-generic.c | 4 ++-- 3 files changed, 21 insertions(+), 11 deletions(-) --- a/include/linux/mm.h~mm-let-pte_lockptr-consume-a-pte_t-pointer +++ a/include/linux/mm.h @@ -2915,9 +2915,10 @@ static inline spinlock_t *ptlock_ptr(str } #endif /* ALLOC_SPLIT_PTLOCKS */ -static inline spinlock_t *pte_lockptr(struct mm_struct *mm, pmd_t *pmd) +static inline spinlock_t *pte_lockptr(struct mm_struct *mm, pte_t *pte) { - return ptlock_ptr(page_ptdesc(pmd_page(*pmd))); + /* PTE page tables don't currently exceed a single page. */ + return ptlock_ptr(virt_to_ptdesc(pte)); } static inline bool ptlock_init(struct ptdesc *ptdesc) @@ -2940,7 +2941,7 @@ static inline bool ptlock_init(struct pt /* * We use mm->page_table_lock to guard all pagetable pages of the mm. */ -static inline spinlock_t *pte_lockptr(struct mm_struct *mm, pmd_t *pmd) +static inline spinlock_t *pte_lockptr(struct mm_struct *mm, pte_t *pte) { return &mm->page_table_lock; } --- a/mm/khugepaged.c~mm-let-pte_lockptr-consume-a-pte_t-pointer +++ a/mm/khugepaged.c @@ -1697,12 +1697,13 @@ static void retract_page_tables(struct a i_mmap_lock_read(mapping); vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) { struct mmu_notifier_range range; + bool retracted = false; struct mm_struct *mm; unsigned long addr; pmd_t *pmd, pgt_pmd; spinlock_t *pml; spinlock_t *ptl; - bool skipped_uffd = false; + pte_t *pte; /* * Check vma->anon_vma to exclude MAP_PRIVATE mappings that @@ -1739,9 +1740,17 @@ static void retract_page_tables(struct a mmu_notifier_invalidate_range_start(&range); pml = pmd_lock(mm, pmd); - ptl = pte_lockptr(mm, pmd); + + /* + * No need to check the PTE table content, but we'll grab the + * PTE table lock while we zap it. + */ + pte = pte_offset_map_nolock(mm, pmd, addr, &ptl); + if (!pte) + goto unlock_pmd; if (ptl != pml) spin_lock_nested(ptl, SINGLE_DEPTH_NESTING); + pte_unmap(pte); /* * Huge page lock is still held, so normally the page table @@ -1752,20 +1761,20 @@ static void retract_page_tables(struct a * repeating the anon_vma check protects from one category, * and repeating the userfaultfd_wp() check from another. */ - if (unlikely(vma->anon_vma || userfaultfd_wp(vma))) { - skipped_uffd = true; - } else { + if (likely(!vma->anon_vma && !userfaultfd_wp(vma))) { pgt_pmd = pmdp_collapse_flush(vma, addr, pmd); pmdp_get_lockless_sync(); + retracted = true; } if (ptl != pml) spin_unlock(ptl); +unlock_pmd: spin_unlock(pml); mmu_notifier_invalidate_range_end(&range); - if (!skipped_uffd) { + if (retracted) { mm_dec_nr_ptes(mm); page_table_check_pte_clear_range(mm, addr, pgt_pmd); pte_free_defer(mm, pmd_pgtable(pgt_pmd)); --- a/mm/pgtable-generic.c~mm-let-pte_lockptr-consume-a-pte_t-pointer +++ a/mm/pgtable-generic.c @@ -313,7 +313,7 @@ pte_t *pte_offset_map_nolock(struct mm_s pte = __pte_offset_map(pmd, addr, &pmdval); if (likely(pte)) - *ptlp = pte_lockptr(mm, &pmdval); + *ptlp = pte_lockptr(mm, pte); return pte; } @@ -371,7 +371,7 @@ again: pte = __pte_offset_map(pmd, addr, &pmdval); if (unlikely(!pte)) return pte; - ptl = pte_lockptr(mm, &pmdval); + ptl = pte_lockptr(mm, pte); spin_lock(ptl); if (likely(pmd_same(pmdval, pmdp_get_lockless(pmd)))) { *ptlp = ptl; _ Patches currently in -mm which might be from david(a)redhat.com are mm-let-pte_lockptr-consume-a-pte_t-pointer-fix.patch mm-hugetlb-fix-hugetlb-vs-core-mm-pt-locking.patch mm-turn-use_split_pte_ptlocks-use_split_pte_ptlocks-into-kconfig-options.patch mm-hugetlb-enforce-that-pmd-pt-sharing-has-split-pmd-pt-locks.patch powerpc-8xx-document-and-enforce-that-split-pt-locks-are-not-used.patch mm-simplify-arch_make_folio_accessible.patch mm-gup-convert-to-arch_make_folio_accessible.patch s390-uv-drop-arch_make_page_accessible.patch

11 months, 2 weeks

1
0
0 0

[PATCH v4 2/2] Revert "ALSA: firewire-lib: operate for period elapse event in process context"

by Edmund Raile

Commit 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event in process context") removed the process context workqueue from amdtp_domain_stream_pcm_pointer() and update_pcm_pointers() to remove its overhead. With RME Fireface 800, this lead to a regression since Kernels 5.14.0, causing an AB/BA deadlock competition for the substream lock with eventual system freeze under ALSA operation: thread 0: * (lock A) acquire substream lock by snd_pcm_stream_lock_irq() in snd_pcm_status64() * (lock B) wait for tasklet to finish by calling tasklet_unlock_spin_wait() in tasklet_disable_in_atomic() in ohci_flush_iso_completions() of ohci.c thread 1: * (lock B) enter tasklet * (lock A) attempt to acquire substream lock, waiting for it to be released: snd_pcm_stream_lock_irqsave() in snd_pcm_period_elapsed() in update_pcm_pointers() in process_ctx_payloads() in process_rx_packets() of amdtp-stream.c ? tasklet_unlock_spin_wait </NMI> <TASK> ohci_flush_iso_completions firewire_ohci amdtp_domain_stream_pcm_pointer snd_firewire_lib snd_pcm_update_hw_ptr0 snd_pcm snd_pcm_status64 snd_pcm ? native_queued_spin_lock_slowpath </NMI> <IRQ> _raw_spin_lock_irqsave snd_pcm_period_elapsed snd_pcm process_rx_packets snd_firewire_lib irq_target_callback snd_firewire_lib handle_it_packet firewire_ohci context_tasklet firewire_ohci Restore the process context work queue to prevent deadlock AB/BA deadlock competition for ALSA substream lock of snd_pcm_stream_lock_irq() in snd_pcm_status64() and snd_pcm_stream_lock_irqsave() in snd_pcm_period_elapsed(). revert commit 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event in process context") Replace inline description to prevent future deadlock. Cc: stable(a)vger.kernel.org Fixes: 7ba5ca32fe6e ("ALSA: firewire-lib: operate for period elapse event in process context") Reported-by: edmund.raile <edmund.raile(a)proton.me> Closes: https://lore.kernel.org/r/kwryofzdmjvzkuw6j3clftsxmoolynljztxqwg76hzeo4simn… Signed-off-by: Edmund Raile <edmund.raile(a)protonmail.com> --- sound/firewire/amdtp-stream.c | 23 +++++++++-------------- 1 file changed, 9 insertions(+), 14 deletions(-) diff --git a/sound/firewire/amdtp-stream.c b/sound/firewire/amdtp-stream.c index 31201d506a21..7438999e0510 100644 --- a/sound/firewire/amdtp-stream.c +++ b/sound/firewire/amdtp-stream.c @@ -615,16 +615,8 @@ static void update_pcm_pointers(struct amdtp_stream *s, // The program in user process should periodically check the status of intermediate // buffer associated to PCM substream to process PCM frames in the buffer, instead // of receiving notification of period elapsed by poll wait. - if (!pcm->runtime->no_period_wakeup) { - if (in_softirq()) { - // In software IRQ context for 1394 OHCI. - snd_pcm_period_elapsed(pcm); - } else { - // In process context of ALSA PCM application under acquired lock of - // PCM substream. - snd_pcm_period_elapsed_under_stream_lock(pcm); - } - } + if (!pcm->runtime->no_period_wakeup) + queue_work(system_highpri_wq, &s->period_work); } } @@ -1864,11 +1856,14 @@ unsigned long amdtp_domain_stream_pcm_pointer(struct amdtp_domain *d, { struct amdtp_stream *irq_target = d->irq_target; - // Process isochronous packets queued till recent isochronous cycle to handle PCM frames. if (irq_target && amdtp_stream_running(irq_target)) { - // In software IRQ context, the call causes dead-lock to disable the tasklet - // synchronously. - if (!in_softirq()) + // use wq to prevent AB/BA deadlock competition for + // substream lock: + // fw_iso_context_flush_completions() acquires + // lock by ohci_flush_iso_completions(), + // amdtp-stream process_rx_packets() attempts to + // acquire same lock by snd_pcm_elapsed() + if (current_work() != &s->period_work) fw_iso_context_flush_completions(irq_target->context); } -- 2.45.2

11 months, 2 weeks

1
0
0 0

[PATCH v4 1/2] Revert "ALSA: firewire-lib: obsolete workqueue for period update"

by Edmund Raile

prepare resolution of AB/BA deadlock competition for substream lock: restore workqueue previously used for process context: revert commit b5b519965c4c ("ALSA: firewire-lib: obsolete workqueue for period update") Cc: stable(a)vger.kernel.org Link: https://lore.kernel.org/r/kwryofzdmjvzkuw6j3clftsxmoolynljztxqwg76hzeo4simn… Signed-off-by: Edmund Raile <edmund.raile(a)protonmail.com> --- sound/firewire/amdtp-stream.c | 15 +++++++++++++++ sound/firewire/amdtp-stream.h | 1 + 2 files changed, 16 insertions(+) diff --git a/sound/firewire/amdtp-stream.c b/sound/firewire/amdtp-stream.c index d35d0a420ee0..31201d506a21 100644 --- a/sound/firewire/amdtp-stream.c +++ b/sound/firewire/amdtp-stream.c @@ -77,6 +77,8 @@ // overrun. Actual device can skip more, then this module stops the packet streaming. #define IR_JUMBO_PAYLOAD_MAX_SKIP_CYCLES 5 +static void pcm_period_work(struct work_struct *work); + /** * amdtp_stream_init - initialize an AMDTP stream structure * @s: the AMDTP stream to initialize @@ -105,6 +107,7 @@ int amdtp_stream_init(struct amdtp_stream *s, struct fw_unit *unit, s->flags = flags; s->context = ERR_PTR(-1); mutex_init(&s->mutex); + INIT_WORK(&s->period_work, pcm_period_work); s->packet_index = 0; init_waitqueue_head(&s->ready_wait); @@ -347,6 +350,7 @@ EXPORT_SYMBOL(amdtp_stream_get_max_payload); */ void amdtp_stream_pcm_prepare(struct amdtp_stream *s) { + cancel_work_sync(&s->period_work); s->pcm_buffer_pointer = 0; s->pcm_period_pointer = 0; } @@ -624,6 +628,16 @@ static void update_pcm_pointers(struct amdtp_stream *s, } } +static void pcm_period_work(struct work_struct *work) +{ + struct amdtp_stream *s = container_of(work, struct amdtp_stream, + period_work); + struct snd_pcm_substream *pcm = READ_ONCE(s->pcm); + + if (pcm) + snd_pcm_period_elapsed(pcm); +} + static int queue_packet(struct amdtp_stream *s, struct fw_iso_packet *params, bool sched_irq) { @@ -1910,6 +1924,7 @@ static void amdtp_stream_stop(struct amdtp_stream *s) return; } + cancel_work_sync(&s->period_work); fw_iso_context_stop(s->context); fw_iso_context_destroy(s->context); s->context = ERR_PTR(-1); diff --git a/sound/firewire/amdtp-stream.h b/sound/firewire/amdtp-stream.h index a1ed2e80f91a..775db3fc4959 100644 --- a/sound/firewire/amdtp-stream.h +++ b/sound/firewire/amdtp-stream.h @@ -191,6 +191,7 @@ struct amdtp_stream { /* For a PCM substream processing. */ struct snd_pcm_substream *pcm; + struct work_struct period_work; snd_pcm_uframes_t pcm_buffer_pointer; unsigned int pcm_period_pointer; unsigned int pcm_frame_multiplier; -- 2.45.2

11 months, 2 weeks

1
0
0 0

Re: [PATCH 6.10 000/809] 6.10.3-rc1 review

by Ronald Warsow

Hi Greg no regressions here on x86_64 (RKL, Intel 11th Gen. CPU) Thanks Tested-by: Ronald Warsow <rwarsow(a)gmx.de>

11 months, 2 weeks

1
0
0 0

[PATCH v2] fs/netfs/fscache_io: remove the obsolete "using_pgpriv2" flag

by Max Kellermann

This fixes a crash bug caused by commit ae678317b95e ("netfs: Remove deprecated use of PG_private_2 as a second writeback flag") by removing a leftover folio_end_private_2() call after all calls to folio_start_private_2() had been removed by the commit. By calling folio_end_private_2() without folio_start_private_2(), the folio refcounter breaks and causes trouble like RCU stalls and general protection faults. Cc: stable(a)vger.kernel.org Fixes: ae678317b95e ("netfs: Remove deprecated use of PG_private_2 as a second writeback flag") Link: https://lore.kernel.org/ceph-devel/CAKPOu+_DA8XiMAA2ApMj7Pyshve_YWknw8Hdt1=… Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com> --- fs/ceph/addr.c | 2 +- fs/netfs/fscache_io.c | 29 +---------------------------- include/linux/fscache.h | 30 ++++-------------------------- 3 files changed, 6 insertions(+), 55 deletions(-) diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index 8c16bc5250ef..485cbd1730d1 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -512,7 +512,7 @@ static void ceph_fscache_write_to_cache(struct inode *inode, u64 off, u64 len, b struct fscache_cookie *cookie = ceph_fscache_cookie(ci); fscache_write_to_cache(cookie, inode->i_mapping, off, len, i_size_read(inode), - ceph_fscache_write_terminated, inode, true, caching); + ceph_fscache_write_terminated, inode, caching); } #else static inline void ceph_fscache_write_to_cache(struct inode *inode, u64 off, u64 len, bool caching) diff --git a/fs/netfs/fscache_io.c b/fs/netfs/fscache_io.c index 38637e5c9b57..0d8f3f646598 100644 --- a/fs/netfs/fscache_io.c +++ b/fs/netfs/fscache_io.c @@ -166,30 +166,10 @@ struct fscache_write_request { loff_t start; size_t len; bool set_bits; - bool using_pgpriv2; netfs_io_terminated_t term_func; void *term_func_priv; }; -void __fscache_clear_page_bits(struct address_space *mapping, - loff_t start, size_t len) -{ - pgoff_t first = start / PAGE_SIZE; - pgoff_t last = (start + len - 1) / PAGE_SIZE; - struct page *page; - - if (len) { - XA_STATE(xas, &mapping->i_pages, first); - - rcu_read_lock(); - xas_for_each(&xas, page, last) { - folio_end_private_2(page_folio(page)); - } - rcu_read_unlock(); - } -} -EXPORT_SYMBOL(__fscache_clear_page_bits); - /* * Deal with the completion of writing the data to the cache. */ @@ -198,10 +178,6 @@ static void fscache_wreq_done(void *priv, ssize_t transferred_or_error, { struct fscache_write_request *wreq = priv; - if (wreq->using_pgpriv2) - fscache_clear_page_bits(wreq->mapping, wreq->start, wreq->len, - wreq->set_bits); - if (wreq->term_func) wreq->term_func(wreq->term_func_priv, transferred_or_error, was_async); @@ -214,7 +190,7 @@ void __fscache_write_to_cache(struct fscache_cookie *cookie, loff_t start, size_t len, loff_t i_size, netfs_io_terminated_t term_func, void *term_func_priv, - bool using_pgpriv2, bool cond) + bool cond) { struct fscache_write_request *wreq; struct netfs_cache_resources *cres; @@ -232,7 +208,6 @@ void __fscache_write_to_cache(struct fscache_cookie *cookie, wreq->mapping = mapping; wreq->start = start; wreq->len = len; - wreq->using_pgpriv2 = using_pgpriv2; wreq->set_bits = cond; wreq->term_func = term_func; wreq->term_func_priv = term_func_priv; @@ -260,8 +235,6 @@ void __fscache_write_to_cache(struct fscache_cookie *cookie, abandon_free: kfree(wreq); abandon: - if (using_pgpriv2) - fscache_clear_page_bits(mapping, start, len, cond); if (term_func) term_func(term_func_priv, ret, false); } diff --git a/include/linux/fscache.h b/include/linux/fscache.h index 9de27643607f..f8c52bddaa15 100644 --- a/include/linux/fscache.h +++ b/include/linux/fscache.h @@ -177,8 +177,7 @@ void __fscache_write_to_cache(struct fscache_cookie *cookie, loff_t start, size_t len, loff_t i_size, netfs_io_terminated_t term_func, void *term_func_priv, - bool using_pgpriv2, bool cond); -extern void __fscache_clear_page_bits(struct address_space *, loff_t, size_t); + bool cond); /** * fscache_acquire_volume - Register a volume as desiring caching services @@ -573,24 +572,6 @@ int fscache_write(struct netfs_cache_resources *cres, return ops->write(cres, start_pos, iter, term_func, term_func_priv); } -/** - * fscache_clear_page_bits - Clear the PG_fscache bits from a set of pages - * @mapping: The netfs inode to use as the source - * @start: The start position in @mapping - * @len: The amount of data to unlock - * @caching: If PG_fscache has been set - * - * Clear the PG_fscache flag from a sequence of pages and wake up anyone who's - * waiting. - */ -static inline void fscache_clear_page_bits(struct address_space *mapping, - loff_t start, size_t len, - bool caching) -{ - if (caching) - __fscache_clear_page_bits(mapping, start, len); -} - /** * fscache_write_to_cache - Save a write to the cache and clear PG_fscache * @cookie: The cookie representing the cache object @@ -600,7 +581,6 @@ static inline void fscache_clear_page_bits(struct address_space *mapping, * @i_size: The new size of the inode * @term_func: The function to call upon completion * @term_func_priv: The private data for @term_func - * @using_pgpriv2: If we're using PG_private_2 to mark in-progress write * @caching: If we actually want to do the caching * * Helper function for a netfs to write dirty data from an inode into the cache @@ -612,21 +592,19 @@ static inline void fscache_clear_page_bits(struct address_space *mapping, * marked with PG_fscache. * * If given, @term_func will be called upon completion and supplied with - * @term_func_priv. Note that if @using_pgpriv2 is set, the PG_private_2 flags - * will have been cleared by this point, so the netfs must retain its own pin - * on the mapping. + * @term_func_priv. */ static inline void fscache_write_to_cache(struct fscache_cookie *cookie, struct address_space *mapping, loff_t start, size_t len, loff_t i_size, netfs_io_terminated_t term_func, void *term_func_priv, - bool using_pgpriv2, bool caching) + bool caching) { if (caching) __fscache_write_to_cache(cookie, mapping, start, len, i_size, term_func, term_func_priv, - using_pgpriv2, caching); + caching); else if (term_func) term_func(term_func_priv, -ENOBUFS, false); -- 2.43.0

11 months, 2 weeks

2
2
0 0

[PATCH] fs/ceph/addr: pass using_pgpriv2=false to fscache_write_to_cache()

by Max Kellermann

This piece was missing in commit ae678317b95e ("netfs: Remove deprecated use of PG_private_2 as a second writeback flag"). There is one remaining use of PG_private_2: the function __fscache_clear_page_bits(), whose only purpose is to clear PG_private_2. This is done via folio_end_private_2() which also releases the folio reference which was supposed to be taken by folio_start_private_2() (via ceph_set_page_fscache()). __fscache_clear_page_bits() is called by __fscache_write_to_cache(), but only if the parameter using_pgpriv2 is true; the only caller of that function is ceph_fscache_write_to_cache() which still passes true. By calling folio_end_private_2() without folio_start_private_2(), the folio refcounter breaks and causes trouble like RCU stalls and general protection faults. Cc: stable(a)vger.kernel.org Fixes: ae678317b95e ("netfs: Remove deprecated use of PG_private_2 as a second writeback flag") Link: https://lore.kernel.org/ceph-devel/CAKPOu+_DA8XiMAA2ApMj7Pyshve_YWknw8Hdt1=… Signed-off-by: Max Kellermann <max.kellermann(a)ionos.com> --- fs/ceph/addr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index 8c16bc5250ef..aacea3e8fd6d 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -512,7 +512,7 @@ static void ceph_fscache_write_to_cache(struct inode *inode, u64 off, u64 len, b struct fscache_cookie *cookie = ceph_fscache_cookie(ci); fscache_write_to_cache(cookie, inode->i_mapping, off, len, i_size_read(inode), - ceph_fscache_write_terminated, inode, true, caching); + ceph_fscache_write_terminated, inode, false, caching); } #else static inline void ceph_fscache_write_to_cache(struct inode *inode, u64 off, u64 len, bool caching) -- 2.43.0

11 months, 2 weeks

2
2
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror July 2024