- Linux-stable-mirror - lists.linaro.org

[PATCH] KVM: x86: Add x2APIC "features" to control EOI broadcast suppression

by Khushit Shah

Add two flags for KVM_CAP_X2APIC_API to allow userspace to control support for Suppress EOI Broadcasts, which KVM completely mishandles. When x2APIC support was first added, KVM incorrectly advertised and "enabled" Suppress EOI Broadcast, without fully supporting the I/O APIC side of the equation, i.e. without adding directed EOI to KVM's in-kernel I/O APIC. That flaw was carried over to split IRQCHIP support, i.e. KVM advertised support for Suppress EOI Broadcasts irrespective of whether or not the userspace I/O APIC implementation supported directed EOIs. Even worse, KVM didn't actually suppress EOI broadcasts, i.e. userspace VMMs without support for directed EOI came to rely on the "spurious" broadcasts. KVM "fixed" the in-kernel I/O APIC implementation by completely disabling support for Supress EOI Broadcasts in commit 0bcc3fb95b97 ("KVM: lapic: stop advertising DIRECTED_EOI when in-kernel IOAPIC is in use"), but didn't do anything to remedy userspace I/O APIC implementations. KVM's bogus handling of Supress EOI Broad is problematic when the guest relies on interrupts being masked in the I/O APIC until well after the initial local APIC EOI. E.g. Windows with Credential Guard enabled handles interrupts in the following order:` 1. Interrupt for L2 arrives. 2. L1 APIC EOIs the interrupt. 3. L1 resumes L2 and injects the interrupt. 4. L2 EOIs after servicing. 5. L1 performs the I/O APIC EOI. Because KVM EOIs the I/O APIC at step #2, the guest can get an interrupt storm, e.g. if the IRQ line is still asserted and userspace reacts to the EOI by re-injecting the IRQ, because the guest doesn't de-assert the line until step #4, and doesn't expect the interrupt to be re-enabled until step #5. Unfortunately, simply "fixing" the bug isn't an option, as KVM has no way of knowing if the userspace I/O APIC supports directed EOIs, i.e. suppressing EOI broadcasts would result in interrupts being stuck masked in the userspace I/O APIC due to step #5 being ignored by userspace. And fully disabling support for Suppress EOI Broadcast is also undesirable, as picking up the fix would require a guest reboot, *and* more importantly would change the virtual CPU model exposed to the guest without any buy-in from userspace. Add two flags to allow userspace to choose exactly how to solve the immediate issue, and in the long term to allow userspace to control the virtual CPU model that is exposed to the guest (KVM should never have enabled supported for Supress EOI Broadcast without a userspace opt-in). Note, Suppress EOI Broadcasts is defined only in Intel's SDM, not in AMD's APM. But the bit is writable on some AMD CPUs, e.g. Turin, and KVM's ABI is to support Directed EOI (KVM's name) irrespective of guest CPU vendor. Fixes: 7543a635aa09 ("KVM: x86: Add KVM exit for IOAPIC EOIs") Closes: https://lore.kernel.org/kvm/7D497EF1-607D-4D37-98E7-DAF95F099342@nutanix.com Cc: stable(a)vger.kernel.org Co-developed-by: Sean Christopherson <seanjc(a)google.com> Signed-off-by: Sean Christopherson <seanjc(a)google.com> Signed-off-by: Khushit Shah <khushit.shah(a)nutanix.com> --- Documentation/virt/kvm/api.rst | 14 ++++++++++++-- arch/x86/include/asm/kvm_host.h | 2 ++ arch/x86/include/uapi/asm/kvm.h | 6 ++++-- arch/x86/kvm/lapic.c | 13 +++++++++++++ arch/x86/kvm/x86.c | 12 +++++++++--- 5 files changed, 40 insertions(+), 7 deletions(-) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 57061fa29e6a..4141d2bd8156 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -7800,8 +7800,10 @@ Will return -EBUSY if a VCPU has already been created. Valid feature flags in args[0] are:: - #define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) - #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) + #define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) + #define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) + #define KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK (1ULL << 2) + #define KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST (1ULL << 3) Enabling KVM_X2APIC_API_USE_32BIT_IDS changes the behavior of KVM_SET_GSI_ROUTING, KVM_SIGNAL_MSI, KVM_SET_LAPIC, and KVM_GET_LAPIC, @@ -7814,6 +7816,14 @@ as a broadcast even in x2APIC mode in order to support physical x2APIC without interrupt remapping. This is undesirable in logical mode, where 0xff represents CPUs 0-7 in cluster 0. +Setting KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK overrides +KVM's quirky behavior of not actually suppressing EOI broadcasts for split IRQ +chips when support for Suppress EOI Broadcasts is advertised to the guest. + +Setting KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST disables support for +Suppress EOI Broadcasts entirely, i.e. instructs KVM to NOT advertise support +to the guest and thus disallow enabling EOI broadcast suppression in SPIV. + 7.8 KVM_CAP_S390_USER_INSTR0 ---------------------------- diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index 48598d017d6f..f6fdc0842c05 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1480,6 +1480,8 @@ struct kvm_arch { bool x2apic_format; bool x2apic_broadcast_quirk_disabled; + bool disable_ignore_suppress_eoi_broadcast_quirk; + bool x2apic_disable_suppress_eoi_broadcast; bool has_mapped_host_mmio; bool guest_can_read_msr_platform_info; diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h index d420c9c066d4..82d49696118f 100644 --- a/arch/x86/include/uapi/asm/kvm.h +++ b/arch/x86/include/uapi/asm/kvm.h @@ -913,8 +913,10 @@ struct kvm_sev_snp_launch_finish { __u64 pad1[4]; }; -#define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) -#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) +#define KVM_X2APIC_API_USE_32BIT_IDS (1ULL << 0) +#define KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK (1ULL << 1) +#define KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK (1ULL << 2) +#define KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST (1ULL << 3) struct kvm_hyperv_eventfd { __u32 conn_id; diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c index 0ae7f913d782..cf8a2162872b 100644 --- a/arch/x86/kvm/lapic.c +++ b/arch/x86/kvm/lapic.c @@ -562,6 +562,7 @@ void kvm_apic_set_version(struct kvm_vcpu *vcpu) * IOAPIC. */ if (guest_cpu_cap_has(vcpu, X86_FEATURE_X2APIC) && + !vcpu->kvm->arch.x2apic_disable_suppress_eoi_broadcast && !ioapic_in_kernel(vcpu->kvm)) v |= APIC_LVR_DIRECTED_EOI; kvm_lapic_set_reg(apic, APIC_LVR, v); @@ -1517,6 +1518,18 @@ static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, int vector) /* Request a KVM exit to inform the userspace IOAPIC. */ if (irqchip_split(apic->vcpu->kvm)) { + /* + * Don't exit to userspace if the guest has enabled Directed + * EOI, a.k.a. Suppress EOI Broadcasts, in which case the local + * APIC doesn't broadcast EOIs (the guest must EOI the target + * I/O APIC(s) directly). Ignore the suppression if userspace + * has NOT disabled KVM's quirk (KVM advertised support for + * Suppress EOI Broadcasts without actually suppressing EOIs). + */ + if ((kvm_lapic_get_reg(apic, APIC_SPIV) & APIC_SPIV_DIRECTED_EOI) && + apic->vcpu->kvm->arch.disable_ignore_suppress_eoi_broadcast_quirk) + return; + apic->vcpu->arch.pending_ioapic_eoi = vector; kvm_make_request(KVM_REQ_IOAPIC_EOI_EXIT, apic->vcpu); return; diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index c9c2aa6f4705..e1b6fe783615 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -121,8 +121,11 @@ static u64 __read_mostly efer_reserved_bits = ~((u64)EFER_SCE); #define KVM_CAP_PMU_VALID_MASK KVM_PMU_CAP_DISABLE -#define KVM_X2APIC_API_VALID_FLAGS (KVM_X2APIC_API_USE_32BIT_IDS | \ - KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK) +#define KVM_X2APIC_API_VALID_FLAGS \ + (KVM_X2APIC_API_USE_32BIT_IDS | \ + KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK | \ + KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK | \ + KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST) static void update_cr8_intercept(struct kvm_vcpu *vcpu); static void process_nmi(struct kvm_vcpu *vcpu); @@ -6782,7 +6785,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, kvm->arch.x2apic_format = true; if (cap->args[0] & KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK) kvm->arch.x2apic_broadcast_quirk_disabled = true; - + if (cap->args[0] & KVM_X2APIC_API_DISABLE_IGNORE_SUPPRESS_EOI_BROADCAST_QUIRK) + kvm->arch.disable_ignore_suppress_eoi_broadcast_quirk = true; + if (cap->args[0] & KVM_X2APIC_API_DISABLE_SUPPRESS_EOI_BROADCAST) + kvm->arch.x2apic_disable_suppress_eoi_broadcast = true; r = 0; break; case KVM_CAP_X86_DISABLE_EXITS: -- 2.39.3

5 days, 21 hours

1
0
0 0

[PATCH] LoongArch: Add new PCI ID for pci_fixup_vgadev()

by Huacai Chen

Loongson-2K3000 has a new PCI ID (0x7a46) for its display controller, Add it for pci_fixup_vgadev() since we prefer a discrete graphics card as default boot device if present. Cc: stable(a)vger.kernel.org Signed-off-by: Tianrui Zhao <zhaotianrui(a)loongson.cn> Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn> --- arch/loongarch/pci/pci.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/loongarch/pci/pci.c b/arch/loongarch/pci/pci.c index d9fc5d520b37..d923295ab8c6 100644 --- a/arch/loongarch/pci/pci.c +++ b/arch/loongarch/pci/pci.c @@ -14,6 +14,7 @@ #define PCI_DEVICE_ID_LOONGSON_HOST 0x7a00 #define PCI_DEVICE_ID_LOONGSON_DC1 0x7a06 #define PCI_DEVICE_ID_LOONGSON_DC2 0x7a36 +#define PCI_DEVICE_ID_LOONGSON_DC3 0x7a46 int raw_pci_read(unsigned int domain, unsigned int bus, unsigned int devfn, int reg, int len, u32 *val) @@ -97,3 +98,4 @@ static void pci_fixup_vgadev(struct pci_dev *pdev) } DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_DC1, pci_fixup_vgadev); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_DC2, pci_fixup_vgadev); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_LOONGSON, PCI_DEVICE_ID_LOONGSON_DC3, pci_fixup_vgadev); -- 2.47.3

5 days, 21 hours

1
0
0 0

[PATCH v1] phy: fsl-imx8mq-usb: fix typec orientation switch when built as module

by Franz Schnyder

From: Franz Schnyder <franz.schnyder(a)toradex.com> Currently, the PHY only registers the typec orientation switch when it is built in. If the typec driver is built as a module, the switch registration is skipped due to the preprocessor condition, causing orientation detection to fail. This patch replaces the preprocessor condition so that the orientation switch is correctly registered for both built-in and module builds. Fixes: b58f0f86fd61 ("phy: fsl-imx8mq-usb: add tca function driver for imx95") Cc: stable(a)vger.kernel.org Signed-off-by: Franz Schnyder <franz.schnyder(a)toradex.com> --- drivers/phy/freescale/phy-fsl-imx8mq-usb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/phy/freescale/phy-fsl-imx8mq-usb.c b/drivers/phy/freescale/phy-fsl-imx8mq-usb.c index b94f242420fc..d498a6b7234b 100644 --- a/drivers/phy/freescale/phy-fsl-imx8mq-usb.c +++ b/drivers/phy/freescale/phy-fsl-imx8mq-usb.c @@ -124,7 +124,7 @@ struct imx8mq_usb_phy { static void tca_blk_orientation_set(struct tca_blk *tca, enum typec_orientation orientation); -#ifdef CONFIG_TYPEC +#if IS_ENABLED(CONFIG_TYPEC) static int tca_blk_typec_switch_set(struct typec_switch_dev *sw, enum typec_orientation orientation) -- 2.43.0

5 days, 22 hours

4
4
0 0

[PATCH] [SCSI] hptiop: Add inbound queue offset bounds check in iop_get_config_itl

by Guangshuo Li

In the driver code for the MV‑based queue variant (struct hpt_iopmu_mv of the hptiop driver), the field "inbound_head" is read from the hardware register and used as an index into the array "inbound_q[MVIOP_QUEUE_LEN]". For example: u32 inbound_head = readl(&hba->u.mv.mu->inbound_head); /* ... */ memcpy_toio(&hba->u.mv.mu->inbound_q[inbound_head], &p, 8); The code then increments head and wraps it to zero when it equals MVIOP_QUEUE_LEN. However, the driver does *not* check that the initial value of "inbound_head" is strictly less than "MVIOP_QUEUE_LEN". If the hardware (or attacker‑controlled firmware/hardware device) writes a malicious value into the inbound_head register (which could be ≥ MVIOP_QUEUE_LEN), then subsequent "memcpy_toio" will write past the end of "inbound_q", leading to an out‑of‑bounds write condition. Since inbound_q is allocated with exactly MVIOP_QUEUE_LEN entries (see: __le64 inbound_q[MVIOP_QUEUE_LEN]; /* MVIOP_QUEUE_LEN == 512 */ ), indexing at e.g. "inbound_head == 512" or greater results in undefined memory access and potential corruption of adjacent fields or memory regions. This issue is particularly concerning in scenarios where an attacker has control or influence over the hardware/firmware on the adapter card (for example a malicious or compromised controller), because they could deliberately set "inbound_head" to a value outside the expected [0, MVIOP_QUEUE_LEN‑1] range, thus forcing the driver to write arbitrary data beyond the queue bounds. To mitigate this issue, we add a check to validate the value of "inbound_head" before it is used as an index. If "inbound_head" is found to be out of bounds (≥ MVIOP_QUEUE_LEN), the head will be reset to 0, and "head" will be set to 1 to ensure that a valid entry is written to the queue. The resetting of "inbound_head" to 0 ensures that the queue processing can continue safely and predictably, while the adjustment of "head = 1" ensures that the next valid index is used for subsequent writes. This prevents any out-of-bounds writes and ensures that the queue continues to operate safely even if the hardware is compromised. Fixes: 00f5970193e22 ("[SCSI] hptiop: add more adapter models and other fixes") Cc: stable(a)vger.kernel.org Signed-off-by: Guangshuo Li <lgs201920130244(a)gmail.com> --- drivers/scsi/hptiop.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/drivers/scsi/hptiop.c b/drivers/scsi/hptiop.c index c01370893a81..a1a3840e6ea8 100644 --- a/drivers/scsi/hptiop.c +++ b/drivers/scsi/hptiop.c @@ -166,6 +166,14 @@ static void mv_inbound_write(u64 p, struct hptiop_hba *hba) if (head == MVIOP_QUEUE_LEN) head = 0; + if (inbound_head >= MVIOP_QUEUE_LEN) { + dev_err(&hba->pdev->dev, + "hptiop: inbound_head out of range (%u)\n", + inbound_head); + inbound_head = 0; + head = 1; + } + memcpy_toio(&hba->u.mv.mu->inbound_q[inbound_head], &p, 8); writel(head, &hba->u.mv.mu->inbound_head); writel(MVIOP_MU_INBOUND_INT_POSTQUEUE, -- 2.43.0

5 days, 23 hours

2
1
0 0

[PATCH V1 5.4.y 0/2] Fix bad pmd due to race between change_prot_numa() and THP migration

by Harry Yoo

# TL;DR previous discussion: https://lore.kernel.org/linux-mm/20250921232709.1608699-1-harry.yoo@oracle.… A "bad pmd" error occurs due to race condition between change_prot_numa() and THP migration. The mainline kernel does not have this bug as commit 670ddd8cdc fixes the race condition. 6.1.y, 5.15.y, 5.10.y, 5.4.y are affected by this bug. Fixing this in -stable kernels is tricky because pte_map_offset_lock() has different semantics in pre-6.5 and post-6.5 kernels. I am trying to backport the same mechanism we have in the mainline kernel. # Testing I verified that the bug described below is not reproduced anymore (on a downstream kernel) after applying this patch series. It used to trigger in few days of intensive numa balancing testing, but it survived 2 weeks with this applied. # Bug Description It was reported that a bad pmd is seen when automatic NUMA balancing is marking page table entries as prot_numa: [2437548.196018] mm/pgtable-generic.c:50: bad pmd 00000000af22fc02(dffffffe71fbfe02) [2437548.235022] Call Trace: [2437548.238234] <TASK> [2437548.241060] dump_stack_lvl+0x46/0x61 [2437548.245689] panic+0x106/0x2e5 [2437548.249497] pmd_clear_bad+0x3c/0x3c [2437548.253967] change_pmd_range.isra.0+0x34d/0x3a7 [2437548.259537] change_p4d_range+0x156/0x20e [2437548.264392] change_protection_range+0x116/0x1a9 [2437548.269976] change_prot_numa+0x15/0x37 [2437548.274774] task_numa_work+0x1b8/0x302 [2437548.279512] task_work_run+0x62/0x95 [2437548.283882] exit_to_user_mode_loop+0x1a4/0x1a9 [2437548.289277] exit_to_user_mode_prepare+0xf4/0xfc [2437548.294751] ? sysvec_apic_timer_interrupt+0x34/0x81 [2437548.300677] irqentry_exit_to_user_mode+0x5/0x25 [2437548.306153] asm_sysvec_apic_timer_interrupt+0x16/0x1b This is due to a race condition between change_prot_numa() and THP migration because the kernel doesn't check is_swap_pmd() and pmd_trans_huge() atomically: change_prot_numa() THP migration ====================================================================== - change_pmd_range() -> is_swap_pmd() returns false, meaning it's not a PMD migration entry. - do_huge_pmd_numa_page() -> migrate_misplaced_page() sets migration entries for the THP. - change_pmd_range() -> pmd_none_or_clear_bad_unless_trans_huge() -> pmd_none() and pmd_trans_huge() returns false - pmd_none_or_clear_bad_unless_trans_huge() -> pmd_bad() returns true for the migration entry! The upstream commit 670ddd8cdcbd ("mm/mprotect: delete pmd_none_or_clear_bad_unless_trans_huge()") closes this race condition by checking is_swap_pmd() and pmd_trans_huge() atomically. # Backporting note commit a79390f5d6a7 ("mm/mprotect: use long for page accountings and retval") is backported to return an error code (negative value) in change_pte_range(). Unlike the mainline, pte_offset_map_lock() does not check if the pmd entry is a migration entry or a hugepage; acquires PTL unconditionally instead of returning failure. Therefore, it is necessary to keep the !is_swap_pmd() && !pmd_trans_huge() && !pmd_devmap() checks in change_pmd_range() before acquiring the PTL. After acquiring the lock, open-code the semantics of pte_offset_map_lock() in the mainline kernel; change_pte_range() fails if the pmd value has changed. This requires adding pmd_old parameter (pmd_t value that is read before calling the function) to change_pte_range(). Hugh Dickins (1): mm/mprotect: delete pmd_none_or_clear_bad_unless_trans_huge() Peter Xu (1): mm/mprotect: use long for page accountings and retval include/linux/hugetlb.h | 4 +- include/linux/mm.h | 2 +- mm/hugetlb.c | 4 +- mm/mempolicy.c | 2 +- mm/mprotect.c | 124 +++++++++++++++++----------------------- 5 files changed, 60 insertions(+), 76 deletions(-) -- 2.43.0

6 days, 1 hour

1
2
0 0

[PATCH V1 5.10.y 0/2] Fix bad pmd due to race between change_prot_numa() and THP migration

by Harry Yoo

# TL;DR previous discussion: https://lore.kernel.org/linux-mm/20250921232709.1608699-1-harry.yoo@oracle.… A "bad pmd" error occurs due to race condition between change_prot_numa() and THP migration. The mainline kernel does not have this bug as commit 670ddd8cdc fixes the race condition. 6.1.y, 5.15.y, 5.10.y, 5.4.y are affected by this bug. Fixing this in -stable kernels is tricky because pte_map_offset_lock() has different semantics in pre-6.5 and post-6.5 kernels. I am trying to backport the same mechanism we have in the mainline kernel. # Testing I verified that the bug described below is not reproduced anymore (on a downstream kernel) after applying this patch series. It used to trigger in few days of intensive numa balancing testing, but it survived 2 weeks with this applied. # Bug Description It was reported that a bad pmd is seen when automatic NUMA balancing is marking page table entries as prot_numa: [2437548.196018] mm/pgtable-generic.c:50: bad pmd 00000000af22fc02(dffffffe71fbfe02) [2437548.235022] Call Trace: [2437548.238234] <TASK> [2437548.241060] dump_stack_lvl+0x46/0x61 [2437548.245689] panic+0x106/0x2e5 [2437548.249497] pmd_clear_bad+0x3c/0x3c [2437548.253967] change_pmd_range.isra.0+0x34d/0x3a7 [2437548.259537] change_p4d_range+0x156/0x20e [2437548.264392] change_protection_range+0x116/0x1a9 [2437548.269976] change_prot_numa+0x15/0x37 [2437548.274774] task_numa_work+0x1b8/0x302 [2437548.279512] task_work_run+0x62/0x95 [2437548.283882] exit_to_user_mode_loop+0x1a4/0x1a9 [2437548.289277] exit_to_user_mode_prepare+0xf4/0xfc [2437548.294751] ? sysvec_apic_timer_interrupt+0x34/0x81 [2437548.300677] irqentry_exit_to_user_mode+0x5/0x25 [2437548.306153] asm_sysvec_apic_timer_interrupt+0x16/0x1b This is due to a race condition between change_prot_numa() and THP migration because the kernel doesn't check is_swap_pmd() and pmd_trans_huge() atomically: change_prot_numa() THP migration ====================================================================== - change_pmd_range() -> is_swap_pmd() returns false, meaning it's not a PMD migration entry. - do_huge_pmd_numa_page() -> migrate_misplaced_page() sets migration entries for the THP. - change_pmd_range() -> pmd_none_or_clear_bad_unless_trans_huge() -> pmd_none() and pmd_trans_huge() returns false - pmd_none_or_clear_bad_unless_trans_huge() -> pmd_bad() returns true for the migration entry! The upstream commit 670ddd8cdcbd ("mm/mprotect: delete pmd_none_or_clear_bad_unless_trans_huge()") closes this race condition by checking is_swap_pmd() and pmd_trans_huge() atomically. # Backporting note commit a79390f5d6a7 ("mm/mprotect: use long for page accountings and retval") is backported to return an error code (negative value) in change_pte_range(). Unlike the mainline, pte_offset_map_lock() does not check if the pmd entry is a migration entry or a hugepage; acquires PTL unconditionally instead of returning failure. Therefore, it is necessary to keep the !is_swap_pmd() && !pmd_trans_huge() && !pmd_devmap() checks in change_pmd_range() before acquiring the PTL. After acquiring the lock, open-code the semantics of pte_offset_map_lock() in the mainline kernel; change_pte_range() fails if the pmd value has changed. This requires adding pmd_old parameter (pmd_t value that is read before calling the function) to change_pte_range(). Hugh Dickins (1): mm/mprotect: delete pmd_none_or_clear_bad_unless_trans_huge() Peter Xu (1): mm/mprotect: use long for page accountings and retval include/linux/hugetlb.h | 4 +- include/linux/mm.h | 2 +- mm/hugetlb.c | 4 +- mm/mempolicy.c | 2 +- mm/mprotect.c | 107 ++++++++++++++++++++++------------------ 5 files changed, 64 insertions(+), 55 deletions(-) -- 2.43.0

6 days, 1 hour

1
2
0 0

[PATCH] media: venus: vdec: restrict EOS addr quirk to IRIS2 only

by Dikshita Agarwal

On SM8250 (IRIS2) with firmware older than 1.0.087, the firmware could not handle a dummy device address for EOS buffers, so a NULL device address is sent instead. The existing check used IS_V6() alongside a firmware version gate: if (IS_V6(core) && is_fw_rev_or_older(core, 1, 0, 87)) fdata.device_addr = 0; else fdata.device_addr = 0xdeadb000; However, SC7280 which is also V6, uses a firmware string of the form "1.0.<commit-hash>", which the version parser translates to 1.0.0. This unintentionally satisfies the `is_fw_rev_or_older(..., 1, 0, 87)` condition on SC7280. Combined with IS_V6() matching there as well, the quirk is incorrectly applied to SC7280, causing VP9 decode failures. Constrain the check to IRIS2 (SM8250) only, which is the only platform that needed this quirk, by replacing IS_V6() with IS_IRIS2(). This restores correct behavior on SC7280 (no forced NULL EOS buffer address). Fixes: 47f867cb1b63 ("media: venus: fix EOS handling in decoder stop command") Cc: stable(a)vger.kernel.org Reported-by: Mecid <notifications(a)github.com> Closes: https://github.com/qualcomm-linux/kernel-topics/issues/222 Co-developed-by: Renjiang Han <renjiang.han(a)oss.qualcomm.com> Signed-off-by: Renjiang Han <renjiang.han(a)oss.qualcomm.com> Signed-off-by: Dikshita Agarwal <dikshita.agarwal(a)oss.qualcomm.com> --- drivers/media/platform/qcom/venus/vdec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/media/platform/qcom/venus/vdec.c b/drivers/media/platform/qcom/venus/vdec.c index 4a6641fdffcf79705893be58c7ec5cf485e2fab9..dc85a5b8c989eb8339e5de9fea7ab49532e7f15a 100644 --- a/drivers/media/platform/qcom/venus/vdec.c +++ b/drivers/media/platform/qcom/venus/vdec.c @@ -565,7 +565,7 @@ vdec_decoder_cmd(struct file *file, void *fh, struct v4l2_decoder_cmd *cmd) fdata.buffer_type = HFI_BUFFER_INPUT; fdata.flags |= HFI_BUFFERFLAG_EOS; - if (IS_V6(inst->core) && is_fw_rev_or_older(inst->core, 1, 0, 87)) + if (IS_IRIS2(inst->core) && is_fw_rev_or_older(inst->core, 1, 0, 87)) fdata.device_addr = 0; else fdata.device_addr = 0xdeadb000; --- base-commit: 1f2353f5a1af995efbf7bea44341aa0d03460b28 change-id: 20251121-venus-vp9-fix-1ff602724c02 Best regards, -- Dikshita Agarwal <dikshita.agarwal(a)oss.qualcomm.com>

6 days, 1 hour

2
2
0 0

FAILED: patch "[PATCH] mptcp: fix address removal logic in mptcp_pm_nl_rm_addr" failed to apply to 6.17-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 6.17-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.17.y git checkout FETCH_HEAD git cherry-pick -x 92e239e36d600002559074994a545fcfac9afd2d # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025112435-stray-aflutter-0f77@gregkh' --subject-prefix 'PATCH 6.17.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 92e239e36d600002559074994a545fcfac9afd2d Mon Sep 17 00:00:00 2001 From: Gang Yan <yangang(a)kylinos.cn> Date: Tue, 18 Nov 2025 08:20:28 +0100 Subject: [PATCH] mptcp: fix address removal logic in mptcp_pm_nl_rm_addr Fix inverted WARN_ON_ONCE condition that prevented normal address removal counter updates. The current code only executes decrement logic when the counter is already 0 (abnormal state), while normal removals (counter > 0) are ignored. Signed-off-by: Gang Yan <yangang(a)kylinos.cn> Fixes: 636113918508 ("mptcp: pm: remove '_nl' from mptcp_pm_nl_rm_addr_received") Cc: stable(a)vger.kernel.org Reviewed-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org> Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-10-806d3… Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> diff --git a/net/mptcp/pm_kernel.c b/net/mptcp/pm_kernel.c index 2ae95476dba3..0a50fd5edc06 100644 --- a/net/mptcp/pm_kernel.c +++ b/net/mptcp/pm_kernel.c @@ -672,7 +672,7 @@ static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk) void mptcp_pm_nl_rm_addr(struct mptcp_sock *msk, u8 rm_id) { - if (rm_id && WARN_ON_ONCE(msk->pm.add_addr_accepted == 0)) { + if (rm_id && !WARN_ON_ONCE(msk->pm.add_addr_accepted == 0)) { u8 limit_add_addr_accepted = mptcp_pm_get_limit_add_addr_accepted(msk);

6 days, 2 hours

2
1
0 0

FAILED: patch "[PATCH] mptcp: do not fallback when OoO is present" failed to apply to 5.10-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y git checkout FETCH_HEAD git cherry-pick -x 1bba3f219c5e8c29e63afa3c1fc24f875ebec119 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025112426-backroom-negate-d125@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 1bba3f219c5e8c29e63afa3c1fc24f875ebec119 Mon Sep 17 00:00:00 2001 From: Paolo Abeni <pabeni(a)redhat.com> Date: Tue, 18 Nov 2025 08:20:22 +0100 Subject: [PATCH] mptcp: do not fallback when OoO is present In case of DSS corruption, the MPTCP protocol tries to avoid the subflow reset if fallback is possible. Such corruptions happen in the receive path; to ensure fallback is possible the stack additionally needs to check for OoO data, otherwise the fallback will break the data stream. Fixes: e32d262c89e2 ("mptcp: handle consistently DSS corruption") Cc: stable(a)vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/598 Signed-off-by: Paolo Abeni <pabeni(a)redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org> Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-4-806d37… Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index e30e9043a694..6f0e8f670d83 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -76,6 +76,13 @@ bool __mptcp_try_fallback(struct mptcp_sock *msk, int fb_mib) if (__mptcp_check_fallback(msk)) return true; + /* The caller possibly is not holding the msk socket lock, but + * in the fallback case only the current subflow is touching + * the OoO queue. + */ + if (!RB_EMPTY_ROOT(&msk->out_of_order_queue)) + return false; + spin_lock_bh(&msk->fallback_lock); if (!msk->allow_infinite_fallback) { spin_unlock_bh(&msk->fallback_lock);

6 days, 2 hours

2
1
0 0

FAILED: patch "[PATCH] mptcp: do not fallback when OoO is present" failed to apply to 5.15-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.15-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable(a)vger.kernel.org>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y git checkout FETCH_HEAD git cherry-pick -x 1bba3f219c5e8c29e63afa3c1fc24f875ebec119 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025112425-math-lasso-c3b8@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^.. Possible dependencies: thanks, greg k-h ------------------ original commit in Linus's tree ------------------ From 1bba3f219c5e8c29e63afa3c1fc24f875ebec119 Mon Sep 17 00:00:00 2001 From: Paolo Abeni <pabeni(a)redhat.com> Date: Tue, 18 Nov 2025 08:20:22 +0100 Subject: [PATCH] mptcp: do not fallback when OoO is present In case of DSS corruption, the MPTCP protocol tries to avoid the subflow reset if fallback is possible. Such corruptions happen in the receive path; to ensure fallback is possible the stack additionally needs to check for OoO data, otherwise the fallback will break the data stream. Fixes: e32d262c89e2 ("mptcp: handle consistently DSS corruption") Cc: stable(a)vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/598 Signed-off-by: Paolo Abeni <pabeni(a)redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org> Link: https://patch.msgid.link/20251118-net-mptcp-misc-fixes-6-18-rc6-v1-4-806d37… Signed-off-by: Jakub Kicinski <kuba(a)kernel.org> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index e30e9043a694..6f0e8f670d83 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -76,6 +76,13 @@ bool __mptcp_try_fallback(struct mptcp_sock *msk, int fb_mib) if (__mptcp_check_fallback(msk)) return true; + /* The caller possibly is not holding the msk socket lock, but + * in the fallback case only the current subflow is touching + * the OoO queue. + */ + if (!RB_EMPTY_ROOT(&msk->out_of_order_queue)) + return false; + spin_lock_bh(&msk->fallback_lock); if (!msk->allow_infinite_fallback) { spin_unlock_bh(&msk->fallback_lock);

6 days, 3 hours

2
1
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror