August 2025 - Linux-stable-mirror

[PATCH v4] x86/cpu/intel: Fix the constant_tsc model check for Pentium 4

by Suchit Karunakaran

Pentium 4's which are INTEL_P4_PRESCOTT (model 0x03) and later have a constant TSC. This was correctly captured until commit fadb6f569b10 ("x86/cpu/intel: Limit the non-architectural constant_tsc model checks"). In that commit, an error was introduced while selecting the last P4 model (0x06) as the upper bound. Model 0x06 was transposed to INTEL_P4_WILLAMETTE, which is just plain wrong. That was presumably a simple typo, probably just copying and pasting the wrong P4 model. Fix the constant TSC logic to cover all later P4 models. End at INTEL_P4_CEDARMILL which accurately corresponds to the last P4 model. Fixes: fadb6f569b10 ("x86/cpu/intel: Limit the non-architectural constant_tsc model checks") Cc: <stable(a)vger.kernel.org> # v6.15 Signed-off-by: Suchit Karunakaran <suchitkarunakaran(a)gmail.com> Changes since v3: - Refined changelog Changes since v2: - Improve commit message Changes since v1: - Fix incorrect logic --- arch/x86/kernel/cpu/intel.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index 076eaa41b8c8..6f5bd5dbc249 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -262,7 +262,7 @@ static void early_init_intel(struct cpuinfo_x86 *c) if (c->x86_power & (1 << 8)) { set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC); set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC); - } else if ((c->x86_vfm >= INTEL_P4_PRESCOTT && c->x86_vfm <= INTEL_P4_WILLAMETTE) || + } else if ((c->x86_vfm >= INTEL_P4_PRESCOTT && c->x86_vfm <= INTEL_P4_CEDARMILL) || (c->x86_vfm >= INTEL_CORE_YONAH && c->x86_vfm <= INTEL_IVYBRIDGE)) { set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC); } -- 2.50.1

4 months, 3 weeks

2
2
0 0

[PATCH net v2] net/packet: fix a race in packet_set_ring() and packet_notifier()

by Willem de Bruijn

From: Quang Le <quanglex97(a)gmail.com> When packet_set_ring() releases po->bind_lock, another thread can run packet_notifier() and process an NETDEV_UP event. This race and the fix are both similar to that of commit 15fe076edea7 ("net/packet: fix a race in packet_bind() and packet_notifier()"). There too the packet_notifier NETDEV_UP event managed to run while a po->bind_lock critical section had to be temporarily released. And the fix was similarly to temporarily set po->num to zero to keep the socket unhooked until the lock is retaken. The po->bind_lock in packet_set_ring and packet_notifier precede the introduction of git history. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable(a)vger.kernel.org Signed-off-by: Quang Le <quanglex97(a)gmail.com> Signed-off-by: Willem de Bruijn <willemb(a)google.com> --- v1->v2: - fix author attribution (From: at the top) v1: https://lore.kernel.org/netdev/20250731175132.2592130-1-willemdebruijn.kern… --- net/packet/af_packet.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index bc438d0d96a7..a7017d7f0927 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -4573,10 +4573,10 @@ static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u, spin_lock(&po->bind_lock); was_running = packet_sock_flag(po, PACKET_SOCK_RUNNING); num = po->num; - if (was_running) { - WRITE_ONCE(po->num, 0); + WRITE_ONCE(po->num, 0); + if (was_running) __unregister_prot_hook(sk, false); - } + spin_unlock(&po->bind_lock); synchronize_net(); @@ -4608,10 +4608,10 @@ static int packet_set_ring(struct sock *sk, union tpacket_req_u *req_u, mutex_unlock(&po->pg_vec_lock); spin_lock(&po->bind_lock); - if (was_running) { - WRITE_ONCE(po->num, num); + WRITE_ONCE(po->num, num); + if (was_running) register_prot_hook(sk); - } + spin_unlock(&po->bind_lock); if (pg_vec && (po->tp_version > TPACKET_V2)) { /* Because we don't support block-based V3 on tx-ring */ -- 2.50.1.565.gc32cd1483b-goog

4 months, 3 weeks

2
1
0 0

[PATCH v2 0/4] i2c: rtl9300: Fix multi-byte I2C operations

by Sven Eckelmann

During the integration of the RTL8239 POE chip + its frontend MCU, it was noticed that multi-byte operations were basically broken in the current driver. Tests using SMBus Block Writes showed that the data (after the Wr + Ack marker) was mixed up on the wire. At first glance, it looked like an endianness problem. But for transfers were the number of count + data bytes was not divisible by 4, the last bytes were not looking like an endianness problem because they were were in the wrong order but not for example 0 - which would be the case for an endianness problem with 32 bit registers. At the end, it turned out to be a the way how i2c_write tried to add the bytes to the send registers. Each 32 bit register was used similar to a shift register - shifting the various bytes up the register while the next one is added to the least significant byte. But the I2C controller expects the first byte of the tranmission in the least significant byte of the first register. And the last byte (assuming it is a 16 byte transfer) in the most significant byte of the fourth register. While doing these tests, it was also observed that the count byte was missing from the SMBus Block Writes. The driver just removed them from the data->block (from the I2C subsystem). But the I2C controller DOES NOT automatically add this byte - for example by using the configured transmission length. The RTL8239 MCU is not actually an SMBus compliant device. Instead, it expects I2C Block Reads + I2C Block Writes. But according to the already identified bugs in the driver, it was clear that the I2C controller can simply be modified to not send the count byte for I2C_SMBUS_I2C_BLOCK_DATA. The receive part, just needs to write the content of the receive buffer to the correct position in data->block. While the on-wire formwat was now correct, reads were still not possible against the MCU (for the RTL8239 POE chip). It was always timing out because the 2ms were not enough for sending the read request and then receiving the 12 byte answer. These changes were originally submitted to OpenWrt. But there are plans to migrate OpenWrt to the upstream Linux driver. As result, the pull request was stopped and the changes were redone against this driver. For reasons of transparency: The work on I2C_SMBUS_I2C_BLOCK_DATA support for the RTL8239-MCU was done on RTL931xx. All problem were therefore detected with the patches from Jonas Jelonek [1] and not the vanilla Linux driver. But looking through the code, it seems like these are NOT regressions introduced by the RTL931x patchset. [1] https://patchwork.ozlabs.org/project/linux-i2c/cover/20250727114800.3046-1-… Signed-off-by: Sven Eckelmann <sven(a)narfation.org> --- Changes in v2: - add the missing transfer width and read length increase for the SMBus Write/Read - Link to v1: https://lore.kernel.org/r/20250802-i2c-rtl9300-multi-byte-v1-0-5f687e0098e2… --- Harshal Gohel (2): i2c: rtl9300: Fix multi-byte I2C write i2c: rtl9300: Implement I2C block read and write Sven Eckelmann (2): i2c: rtl9300: Increase timeout for transfer polling i2c: rtl9300: Add missing count byte for SMBus Block Ops drivers/i2c/busses/i2c-rtl9300.c | 43 +++++++++++++++++++++++++++++++++------- 1 file changed, 36 insertions(+), 7 deletions(-) --- base-commit: b9ddaa95fd283bce7041550ddbbe7e764c477110 change-id: 20250802-i2c-rtl9300-multi-byte-edaa1fb0872c Best regards, -- Sven Eckelmann <sven(a)narfation.org>

4 months, 4 weeks

2
7
0 0

[PATCH v3 0/5] i2c: rtl9300: Fix multi-byte I2C operations

by Sven Eckelmann

During the integration of the RTL8239 POE chip + its frontend MCU, it was noticed that multi-byte operations were basically broken in the current driver. Tests using SMBus Block Writes showed that the data (after the Wr + Ack marker) was mixed up on the wire. At first glance, it looked like an endianness problem. But for transfers were the number of count + data bytes was not divisible by 4, the last bytes were not looking like an endianness problem because they were were in the wrong order but not for example 0 - which would be the case for an endianness problem with 32 bit registers. At the end, it turned out to be a the way how i2c_write tried to add the bytes to the send registers. Each 32 bit register was used similar to a shift register - shifting the various bytes up the register while the next one is added to the least significant byte. But the I2C controller expects the first byte of the tranmission in the least significant byte of the first register. And the last byte (assuming it is a 16 byte transfer) in the most significant byte of the fourth register. While doing these tests, it was also observed that the count byte was missing from the SMBus Block Writes. The driver just removed them from the data->block (from the I2C subsystem). But the I2C controller DOES NOT automatically add this byte - for example by using the configured transmission length. The RTL8239 MCU is not actually an SMBus compliant device. Instead, it expects I2C Block Reads + I2C Block Writes. But according to the already identified bugs in the driver, it was clear that the I2C controller can simply be modified to not send the count byte for I2C_SMBUS_I2C_BLOCK_DATA. The receive part, just needs to write the content of the receive buffer to the correct position in data->block. While the on-wire formwat was now correct, reads were still not possible against the MCU (for the RTL8239 POE chip). It was always timing out because the 2ms were not enough for sending the read request and then receiving the 12 byte answer. These changes were originally submitted to OpenWrt. But there are plans to migrate OpenWrt to the upstream Linux driver. As result, the pull request was stopped and the changes were redone against this driver. For reasons of transparency: The work on I2C_SMBUS_I2C_BLOCK_DATA support for the RTL8239-MCU was done on RTL931xx. All problem were therefore detected with the patches from Jonas Jelonek [1] and not the vanilla Linux driver. But looking through the code, it seems like these are NOT regressions introduced by the RTL931x patchset. I've picked up Alex Guo's patch [2] to reduce conflicts between pending fixes. [1] https://patchwork.ozlabs.org/project/linux-i2c/cover/20250727114800.3046-1-… [2] https://lore.kernel.org/r/20250615235248.529019-1-alexguo1023@gmail.com Signed-off-by: Sven Eckelmann <sven(a)narfation.org> --- Changes in v3: - integrated patch https://lore.kernel.org/r/20250615235248.529019-1-alexguo1023@gmail.com to avoid conflicts in the I2C_SMBUS_BLOCK_DATA code - added Fixes and stable(a)vger.kernel.org to Alex Guo's patch - added Chris Packham's Reviewed-by/Acked-by - Link to v2: https://lore.kernel.org/r/20250803-i2c-rtl9300-multi-byte-v2-0-9b7b759fe2b6… Changes in v2: - add the missing transfer width and read length increase for the SMBus Write/Read - Link to v1: https://lore.kernel.org/r/20250802-i2c-rtl9300-multi-byte-v1-0-5f687e0098e2… --- Alex Guo (1): i2c: rtl9300: Fix out-of-bounds bug in rtl9300_i2c_smbus_xfer Harshal Gohel (2): i2c: rtl9300: Fix multi-byte I2C write i2c: rtl9300: Implement I2C block read and write Sven Eckelmann (2): i2c: rtl9300: Increase timeout for transfer polling i2c: rtl9300: Add missing count byte for SMBus Block Ops drivers/i2c/busses/i2c-rtl9300.c | 51 ++++++++++++++++++++++++++++++++++------ 1 file changed, 44 insertions(+), 7 deletions(-) --- base-commit: 0ae982df67760cd08affa935c0fe86c8a9311797 change-id: 20250802-i2c-rtl9300-multi-byte-edaa1fb0872c Best regards, -- Sven Eckelmann <sven(a)narfation.org>

4 months, 4 weeks

1
4
0 0

[PATCH] amdgpu/amdgpu_discovery: increase timeout limit for IFWI init

by Xaver Hugl

With a timeout of only 1 second, my rx 5700XT fails to initialize, so this increases the timeout to 2s. Closes https://gitlab.freedesktop.org/drm/amd/-/issues/3697 Signed-off-by: Xaver Hugl <xaver.hugl(a)kde.org> Cc: stable(a)vger.kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c index 6d34eac0539d..ae6908b57d78 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c @@ -275,7 +275,7 @@ static int amdgpu_discovery_read_binary_from_mem(struct amdgpu_device *adev, int i, ret = 0; if (!amdgpu_sriov_vf(adev)) { - /* It can take up to a second for IFWI init to complete on some dGPUs, + /* It can take up to two seconds for IFWI init to complete on some dGPUs, * but generally it should be in the 60-100ms range. Normally this starts * as soon as the device gets power so by the time the OS loads this has long * completed. However, when a card is hotplugged via e.g., USB4, we need to @@ -283,7 +283,7 @@ static int amdgpu_discovery_read_binary_from_mem(struct amdgpu_device *adev, * continue. */ - for (i = 0; i < 1000; i++) { + for (i = 0; i < 2000; i++) { msg = RREG32(mmMP0_SMN_C2PMSG_33); if (msg & 0x80000000) break; -- 2.50.1

4 months, 4 weeks

2
1
0 0

[PATCH] eventpoll: Fix semi-unbounded recursion

by Jann Horn

Ensure that epoll instances can never form a graph deeper than EP_MAX_NESTS+1 links. Currently, ep_loop_check_proc() ensures that the graph is loop-free and does some recursion depth checks, but those recursion depth checks don't limit the depth of the resulting tree for two reasons: - They don't look upwards in the tree. - If there are multiple downwards paths of different lengths, only one of the paths is actually considered for the depth check since commit 28d82dc1c4ed ("epoll: limit paths"). Essentially, the current recursion depth check in ep_loop_check_proc() just serves to prevent it from recursing too deeply while checking for loops. A more thorough check is done in reverse_path_check() after the new graph edge has already been created; this checks, among other things, that no paths going upwards from any non-epoll file with a length of more than 5 edges exist. However, this check does not apply to non-epoll files. As a result, it is possible to recurse to a depth of at least roughly 500, tested on v6.15. (I am unsure if deeper recursion is possible; and this may have changed with commit 8c44dac8add7 ("eventpoll: Fix priority inversion problem").) To fix it: 1. In ep_loop_check_proc(), note the subtree depth of each visited node, and use subtree depths for the total depth calculation even when a subtree has already been visited. 2. Add ep_get_upwards_depth_proc() for similarly determining the maximum depth of an upwards walk. 3. In ep_loop_check(), use these values to limit the total path length between epoll nodes to EP_MAX_NESTS edges. Fixes: 22bacca48a17 ("epoll: prevent creating circular epoll structures") Cc: stable(a)vger.kernel.org Signed-off-by: Jann Horn <jannh(a)google.com> --- fs/eventpoll.c | 60 ++++++++++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 46 insertions(+), 14 deletions(-) diff --git a/fs/eventpoll.c b/fs/eventpoll.c index d4dbffdedd08..44648cc09250 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -218,6 +218,7 @@ struct eventpoll { /* used to optimize loop detection check */ u64 gen; struct hlist_head refs; + u8 loop_check_depth; /* * usage count, used together with epitem->dying to @@ -2142,23 +2143,24 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events, } /** - * ep_loop_check_proc - verify that adding an epoll file inside another - * epoll structure does not violate the constraints, in - * terms of closed loops, or too deep chains (which can - * result in excessive stack usage). + * ep_loop_check_proc - verify that adding an epoll file @ep inside another + * epoll file does not create closed loops, and + * determine the depth of the subtree starting at @ep * * @ep: the &struct eventpoll to be currently checked. * @depth: Current depth of the path being checked. * - * Return: %zero if adding the epoll @file inside current epoll - * structure @ep does not violate the constraints, or %-1 otherwise. + * Return: depth of the subtree, or INT_MAX if we found a loop or went too deep. */ static int ep_loop_check_proc(struct eventpoll *ep, int depth) { - int error = 0; + int result = 0; struct rb_node *rbp; struct epitem *epi; + if (ep->gen == loop_check_gen) + return ep->loop_check_depth; + mutex_lock_nested(&ep->mtx, depth + 1); ep->gen = loop_check_gen; for (rbp = rb_first_cached(&ep->rbr); rbp; rbp = rb_next(rbp)) { @@ -2166,13 +2168,11 @@ static int ep_loop_check_proc(struct eventpoll *ep, int depth) if (unlikely(is_file_epoll(epi->ffd.file))) { struct eventpoll *ep_tovisit; ep_tovisit = epi->ffd.file->private_data; - if (ep_tovisit->gen == loop_check_gen) - continue; if (ep_tovisit == inserting_into || depth > EP_MAX_NESTS) - error = -1; + result = INT_MAX; else - error = ep_loop_check_proc(ep_tovisit, depth + 1); - if (error != 0) + result = max(result, ep_loop_check_proc(ep_tovisit, depth + 1) + 1); + if (result > EP_MAX_NESTS) break; } else { /* @@ -2186,9 +2186,27 @@ static int ep_loop_check_proc(struct eventpoll *ep, int depth) list_file(epi->ffd.file); } } + ep->loop_check_depth = result; mutex_unlock(&ep->mtx); - return error; + return result; +} + +/** + * ep_get_upwards_depth_proc - determine depth of @ep when traversed upwards + */ +static int ep_get_upwards_depth_proc(struct eventpoll *ep, int depth) +{ + int result = 0; + struct epitem *epi; + + if (ep->gen == loop_check_gen) + return ep->loop_check_depth; + hlist_for_each_entry_rcu(epi, &ep->refs, fllink) + result = max(result, ep_get_upwards_depth_proc(epi->ep, depth + 1) + 1); + ep->gen = loop_check_gen; + ep->loop_check_depth = result; + return result; } /** @@ -2204,8 +2222,22 @@ static int ep_loop_check_proc(struct eventpoll *ep, int depth) */ static int ep_loop_check(struct eventpoll *ep, struct eventpoll *to) { + int depth, upwards_depth; + inserting_into = ep; - return ep_loop_check_proc(to, 0); + /* + * Check how deep down we can get from @to, and whether it is possible + * to loop up to @ep. + */ + depth = ep_loop_check_proc(to, 0); + if (depth > EP_MAX_NESTS) + return -1; + /* Check how far up we can go from @ep. */ + rcu_read_lock(); + upwards_depth = ep_get_upwards_depth_proc(ep, 0); + rcu_read_unlock(); + + return (depth+1+upwards_depth > EP_MAX_NESTS) ? -1 : 0; } static void clear_tfile_check_list(void) --- base-commit: 0ff41df1cb268fc69e703a08a57ee14ae967d0ca change-id: 20250711-epoll-recursion-fix-fb0e336b2aeb -- Jann Horn <jannh(a)google.com>

4 months, 4 weeks

2
3
0 0

[PATCH AUTOSEL 6.6 01/59] usb: xhci: print xhci->xhc_state when queue_command failed

by Sasha Levin

From: Su Hui <suhui(a)nfschina.com> [ Upstream commit 7919407eca2ef562fa6c98c41cfdf6f6cdd69d92 ] When encounters some errors like these: xhci_hcd 0000:4a:00.2: xHCI dying or halted, can't queue_command xhci_hcd 0000:4a:00.2: FIXME: allocate a command ring segment usb usb5-port6: couldn't allocate usb_device It's hard to know whether xhc_state is dying or halted. So it's better to print xhc_state's value which can help locate the resaon of the bug. Signed-off-by: Su Hui <suhui(a)nfschina.com> Link: https://lore.kernel.org/r/20250725060117.1773770-1-suhui@nfschina.com Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org> Signed-off-by: Sasha Levin <sashal(a)kernel.org> --- LLM Generated explanations, may be completely bogus: **Backport Status: YES** This commit is suitable for backporting to stable kernel trees for the following reasons: 1. **Enhanced Debugging for Real-World Issues**: The commit improves debugging of USB xHCI host controller failures by printing the actual `xhc_state` value when `queue_command` fails. The commit message shows real error messages users encounter ("xHCI dying or halted, can't queue_command"), demonstrating this is a real-world debugging problem. 2. **Minimal and Safe Change**: The change is extremely small and safe - it only modifies a debug print statement from: ```c xhci_dbg(xhci, "xHCI dying or halted, can't queue_command\n"); ``` to: ```c xhci_dbg(xhci, "xHCI dying or halted, can't queue_command. state: 0x%x\n", xhci->xhc_state); ``` 3. **No Functional Changes**: This is a pure diagnostic improvement. It doesn't change any logic, control flow, or data structures. It only adds the state value (0x%x format) to an existing debug message. 4. **Important for Troubleshooting**: The xHCI driver is critical for USB functionality, and when it fails with "dying or halted" states, knowing the exact state helps diagnose whether: - `XHCI_STATE_DYING` (0x1) - controller is dying - `XHCI_STATE_HALTED` (0x2) - controller is halted - Both states (0x3) - controller has both flags set This distinction is valuable for debugging hardware issues, driver bugs, or system problems. 5. **Zero Risk of Regression**: Adding a parameter to a debug print statement has no risk of introducing regressions. The worst case is the debug message prints the state value. 6. **Follows Stable Rules**: This meets stable kernel criteria as it: - Fixes a real debugging limitation - Is obviously correct - Has been tested (signed-off and accepted by Greg KH) - Is small (single line change) - Doesn't add new features, just improves existing diagnostics The commit helps system administrators and developers diagnose USB issues more effectively by providing the actual state value rather than just saying "dying or halted", making it a valuable debugging enhancement for stable kernels. drivers/usb/host/xhci-ring.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index 44352df58c9e..c6d89b51c678 100644 --- a/drivers/usb/host/xhci-ring.c +++ b/drivers/usb/host/xhci-ring.c @@ -4454,7 +4454,8 @@ static int queue_command(struct xhci_hcd *xhci, struct xhci_command *cmd, if ((xhci->xhc_state & XHCI_STATE_DYING) || (xhci->xhc_state & XHCI_STATE_HALTED)) { - xhci_dbg(xhci, "xHCI dying or halted, can't queue_command\n"); + xhci_dbg(xhci, "xHCI dying or halted, can't queue_command. state: 0x%x\n", + xhci->xhc_state); return -ESHUTDOWN; } -- 2.39.5

4 months, 4 weeks

2
60
0 0

[PATCH v2] powerpc/mm: Fix SLB multihit issue during SLB preload

by Donet Tom

On systems using the hash MMU, there is a software SLB preload cache that mirrors the entries loaded into the hardware SLB buffer. This preload cache is subject to periodic eviction — typically after every 256 context switches — to remove old entry. To optimize performance, the kernel skips switch_mmu_context() in switch_mm_irqs_off() when the prev and next mm_struct are the same. However, on hash MMU systems, this can lead to inconsistencies between the hardware SLB and the software preload cache. If an SLB entry for a process is evicted from the software cache on one CPU, and the same process later runs on another CPU without executing switch_mmu_context(), the hardware SLB may retain stale entries. If the kernel then attempts to reload that entry, it can trigger an SLB multi-hit error. The following timeline shows how stale SLB entries are created and can cause a multi-hit error when a process moves between CPUs without a MMU context switch. CPU 0 CPU 1 ----- ----- Process P exec swapper/1 load_elf_binary begin_new_exc activate_mm switch_mm_irqs_off switch_mmu_context switch_slb /* * This invalidates all * the entries in the HW * and setup the new HW * SLB entries as per the * preload cache. */ context_switch sched_migrate_task migrates process P to cpu-1 Process swapper/0 context switch (to process P) (uses mm_struct of Process P) switch_mm_irqs_off() switch_slb load_slb++ /* * load_slb becomes 0 here * and we evict an entry from * the preload cache with * preload_age(). We still * keep HW SLB and preload * cache in sync, that is * because all HW SLB entries * anyways gets evicted in * switch_slb during SLBIA. * We then only add those * entries back in HW SLB, * which are currently * present in preload_cache * (after eviction). */ load_elf_binary continues... setup_new_exec() slb_setup_new_exec() sched_switch event sched_migrate_task migrates process P to cpu-0 context_switch from swapper/0 to Process P switch_mm_irqs_off() /* * Since both prev and next mm struct are same we don't call * switch_mmu_context(). This will cause the HW SLB and SW preload * cache to go out of sync in preload_new_slb_context. Because there * was an SLB entry which was evicted from both HW and preload cache * on cpu-1. Now later in preload_new_slb_context(), when we will try * to add the same preload entry again, we will add this to the SW * preload cache and then will add it to the HW SLB. Since on cpu-0 * this entry was never invalidated, hence adding this entry to the HW * SLB will cause a SLB multi-hit error. */ load_elf_binary continues... START_THREAD start_thread preload_new_slb_context /* * This tries to add a new EA to preload cache which was earlier * evicted from both cpu-1 HW SLB and preload cache. This caused the * HW SLB of cpu-0 to go out of sync with the SW preload cache. The * reason for this was, that when we context switched back on CPU-0, * we should have ideally called switch_mmu_context() which will * bring the HW SLB entries on CPU-0 in sync with SW preload cache * entries by setting up the mmu context properly. But we didn't do * that since the prev mm_struct running on cpu-0 was same as the * next mm_struct (which is true for swapper / kernel threads). So * now when we try to add this new entry into the HW SLB of cpu-0, * we hit a SLB multi-hit error. */ WARNING: CPU: 0 PID: 1810970 at arch/powerpc/mm/book3s64/slb.c:62 assert_slb_presence+0x2c/0x50(48 results) 02:47:29 [20157/42149] Modules linked in: CPU: 0 UID: 0 PID: 1810970 Comm: dd Not tainted 6.16.0-rc3-dirty #12 VOLUNTARY Hardware name: IBM pSeries (emulated by qemu) POWER8 (architected) 0x4d0200 0xf000004 of:SLOF,HEAD hv:linux,kvm pSeries NIP: c00000000015426c LR: c0000000001543b4 CTR: 0000000000000000 REGS: c0000000497c77e0 TRAP: 0700 Not tainted (6.16.0-rc3-dirty) MSR: 8000000002823033 <SF,VEC,VSX,FP,ME,IR,DR,RI,LE> CR: 28888482 XER: 00000000 CFAR: c0000000001543b0 IRQMASK: 3 <...> NIP [c00000000015426c] assert_slb_presence+0x2c/0x50 LR [c0000000001543b4] slb_insert_entry+0x124/0x390 Call Trace: 0x7fffceb5ffff (unreliable) preload_new_slb_context+0x100/0x1a0 start_thread+0x26c/0x420 load_elf_binary+0x1b04/0x1c40 bprm_execve+0x358/0x680 do_execveat_common+0x1f8/0x240 sys_execve+0x58/0x70 system_call_exception+0x114/0x300 system_call_common+0x160/0x2c4 To fix this issue, we add a code change to always switch the MMU context on hash MMU if the SLB preload cache has aged. With this change, the SLB multi-hit error no longer occurs. cc: Christophe Leroy <christophe.leroy(a)csgroup.eu> cc: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com> cc: Michael Ellerman <mpe(a)ellerman.id.au> cc: Nicholas Piggin <npiggin(a)gmail.com> Fixes: 5434ae74629a ("powerpc/64s/hash: Add a SLB preload cache") cc: stable(a)vger.kernel.org Suggested-by: Ritesh Harjani (IBM) <ritesh.list(a)gmail.com> Signed-off-by: Donet Tom <donettom(a)linux.ibm.com> --- v1 -> v2 : Changed commit message and added a comment in switch_mm_irqs_off() v1 - https://lore.kernel.org/all/20250731161027.966196-1-donettom@linux.ibm.com/ --- arch/powerpc/mm/book3s64/slb.c | 2 +- arch/powerpc/mm/mmu_context.c | 7 +++++-- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/mm/book3s64/slb.c b/arch/powerpc/mm/book3s64/slb.c index 6b783552403c..08daac3f978c 100644 --- a/arch/powerpc/mm/book3s64/slb.c +++ b/arch/powerpc/mm/book3s64/slb.c @@ -509,7 +509,7 @@ void switch_slb(struct task_struct *tsk, struct mm_struct *mm) * SLB preload cache. */ tsk->thread.load_slb++; - if (!tsk->thread.load_slb) { + if (tsk->thread.load_slb == U8_MAX) { unsigned long pc = KSTK_EIP(tsk); preload_age(ti); diff --git a/arch/powerpc/mm/mmu_context.c b/arch/powerpc/mm/mmu_context.c index 3e3af29b4523..95455d787288 100644 --- a/arch/powerpc/mm/mmu_context.c +++ b/arch/powerpc/mm/mmu_context.c @@ -83,8 +83,11 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, /* Some subarchs need to track the PGD elsewhere */ switch_mm_pgdir(tsk, next); - /* Nothing else to do if we aren't actually switching */ - if (prev == next) + /* + * Nothing else to do if we aren't actually switching and + * the preload slb cache has not aged + */ + if ((prev == next) && (tsk->thread.load_slb != U8_MAX)) return; /* -- 2.50.1

4 months, 4 weeks

4
4
0 0

[PATCH v2] mfd: intel_soc_pmic_chtdc_ti: Set use_single_read regmap_config flag

by Hans de Goede

Testing has shown that reading multiple registers at once (for 10 bit adc values) does not work. Set the use_single_read regmap_config flag to make regmap split these for is. This should fix temperature opregion accesses done by drivers/acpi/pmic/intel_pmic_chtdc_ti.c and is also necessary for the upcoming drivers for the ADC and battery MFD cells. Fixes: 6bac0606fdba ("mfd: Add support for Cherry Trail Dollar Cove TI PMIC") Cc: stable(a)vger.kernel.org Reviewed-by: Andy Shevchenko <andy(a)kernel.org> Signed-off-by: Hans de Goede <hansg(a)kernel.org> --- Changes in v2: - Update comment to: "The hardware does not support reading multiple registers at once" --- drivers/mfd/intel_soc_pmic_chtdc_ti.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/mfd/intel_soc_pmic_chtdc_ti.c b/drivers/mfd/intel_soc_pmic_chtdc_ti.c index 4c1a68c9f575..6daf33e07ea0 100644 --- a/drivers/mfd/intel_soc_pmic_chtdc_ti.c +++ b/drivers/mfd/intel_soc_pmic_chtdc_ti.c @@ -82,6 +82,8 @@ static const struct regmap_config chtdc_ti_regmap_config = { .reg_bits = 8, .val_bits = 8, .max_register = 0xff, + /* The hardware does not support reading multiple registers at once */ + .use_single_read = true, }; static const struct regmap_irq chtdc_ti_irqs[] = { -- 2.49.0

4 months, 4 weeks

2
1
0 0

[PATCH] mfd: intel_soc_pmic_chtdc_ti: Set use_single_read regmap_config flag

by Hans de Goede

Testing has shown that reading multiple registers at once (for 10 bit adc values) does not work. Set the use_single_read regmap_config flag to make regmap split these for is. This should fix temperature opregion accesses done by drivers/acpi/pmic/intel_pmic_chtdc_ti.c and is also necessary for the upcoming drivers for the ADC and battery MFD cells. Fixes: 6bac0606fdba ("mfd: Add support for Cherry Trail Dollar Cove TI PMIC") Cc: stable(a)vger.kernel.org Signed-off-by: Hans de Goede <hansg(a)kernel.org> --- drivers/mfd/intel_soc_pmic_chtdc_ti.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/mfd/intel_soc_pmic_chtdc_ti.c b/drivers/mfd/intel_soc_pmic_chtdc_ti.c index 4c1a68c9f575..a23bda8ddae8 100644 --- a/drivers/mfd/intel_soc_pmic_chtdc_ti.c +++ b/drivers/mfd/intel_soc_pmic_chtdc_ti.c @@ -82,6 +82,8 @@ static const struct regmap_config chtdc_ti_regmap_config = { .reg_bits = 8, .val_bits = 8, .max_register = 0xff, + /* Reading multiple registers at once is not supported */ + .use_single_read = true, }; static const struct regmap_irq chtdc_ti_irqs[] = { -- 2.49.0

4 months, 4 weeks

2
4
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror August 2025