From: Oleg Nesterov <oleg@redhat.com>
[ Upstream commit c7b4133c48445dde789ed30b19ccb0448c7593f7 ]
1. Clear utask->xol_vaddr unconditionally, even if this addr is not valid; xol_free_insn_slot() should never return with utask->xol_vaddr != NULL.

2. Add a comment to explain why we need to validate slot_addr.

3. Simplify the validation above. We can simply check offset < PAGE_SIZE; unsigned underflow is fine, so the check still fails as it should when slot_addr < area->vaddr (see the sketch below the list).

4. Kill the unnecessary "slot_nr >= UINSNS_PER_PAGE" check; slot_nr must be valid if offset < PAGE_SIZE.
The next patches will cleanup this function even more.
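Not part of the patch, but to make point 3 concrete, here is a minimal userspace C sketch of the single-comparison bounds check. slot_addr_valid() and the sample vaddr are hypothetical stand-ins, and PAGE_SIZE is hardcoded rather than taken from kernel headers:

#include <stdio.h>

#define PAGE_SIZE 4096UL

static int slot_addr_valid(unsigned long slot_addr, unsigned long area_vaddr)
{
	/* unsigned subtraction: wraps to a huge value when slot_addr < area_vaddr */
	unsigned long offset = slot_addr - area_vaddr;

	return offset < PAGE_SIZE;
}

int main(void)
{
	unsigned long vaddr = 0x70000000UL;

	printf("%d\n", slot_addr_valid(vaddr + 128, vaddr));       /* 1: inside the page */
	printf("%d\n", slot_addr_valid(vaddr - 8, vaddr));         /* 0: below area->vaddr, offset wraps */
	printf("%d\n", slot_addr_valid(vaddr + PAGE_SIZE, vaddr)); /* 0: one past the end */
	return 0;
}

Because offset is unsigned, both out-of-range cases collapse into the one comparison, which is also why the separate slot_nr check can go.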
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240929144235.GA9471@redhat.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 kernel/events/uprobes.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 9ee25351cecac..e3c616085137e 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1632,8 +1632,8 @@ static unsigned long xol_get_insn_slot(struct uprobe *uprobe)
 static void xol_free_insn_slot(struct task_struct *tsk)
 {
 	struct xol_area *area;
-	unsigned long vma_end;
 	unsigned long slot_addr;
+	unsigned long offset;
 
 	if (!tsk->mm || !tsk->mm->uprobes_state.xol_area || !tsk->utask)
 		return;
@@ -1642,24 +1642,21 @@ static void xol_free_insn_slot(struct task_struct *tsk)
 	if (unlikely(!slot_addr))
 		return;
 
+	tsk->utask->xol_vaddr = 0;
 	area = tsk->mm->uprobes_state.xol_area;
-	vma_end = area->vaddr + PAGE_SIZE;
-	if (area->vaddr <= slot_addr && slot_addr < vma_end) {
-		unsigned long offset;
-		int slot_nr;
-
-		offset = slot_addr - area->vaddr;
-		slot_nr = offset / UPROBE_XOL_SLOT_BYTES;
-		if (slot_nr >= UINSNS_PER_PAGE)
-			return;
+	offset = slot_addr - area->vaddr;
+	/*
+	 * slot_addr must fit into [area->vaddr, area->vaddr + PAGE_SIZE).
+	 * This check can only fail if the "[uprobes]" vma was mremap'ed.
+	 */
+	if (offset < PAGE_SIZE) {
+		int slot_nr = offset / UPROBE_XOL_SLOT_BYTES;
 
 		clear_bit(slot_nr, area->bitmap);
 		atomic_dec(&area->slot_count);
 		smp_mb__after_atomic(); /* pairs with prepare_to_wait() */
 		if (waitqueue_active(&area->wq))
 			wake_up(&area->wq);
-
-		tsk->utask->xol_vaddr = 0;
 	}
 }
From: Breno Leitao <leitao@debian.org>
[ Upstream commit de20037e1b3c2f2ca97b8c12b8c7bca8abd509a7 ]
Warning on every leaked bit can cause a flood of messages, triggering various stall-warning mechanisms, including CSD-lock warnings, which makes the machine unusable.
Track the bits that are being leaked, and only warn when a new bit is set (a minimal userspace sketch of this scheme follows the list below).
That said, this patch will help with the following issues:
1) It will tell us which bits are being set, so it is easy to communicate that back to the vendor and to do a root-cause analysis.

2) It avoids making the machine unusable, because, in the worst-case scenario, the user gets fewer than 60 WARNs (one per unhandled bit).
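As an illustration only (not taken from the patch itself), a minimal userspace C model of the warn-once-per-bit scheme: status_warned mirrors the kernel's static atomic64_t accumulator, while warn_once_per_bit() and fprintf() are hypothetical stand-ins for the interrupt handler and WARN():

#include <stdatomic.h>
#include <stdio.h>

static _Atomic unsigned long long status_warned;

static void warn_once_per_bit(unsigned long long status)
{
	unsigned long long prev_bits, new_bits;

	/* atomically merge status into the seen-set and fetch the old set */
	prev_bits = atomic_fetch_or(&status_warned, status);
	/* warn only about bits set for the very first time */
	new_bits = status & ~prev_bits;
	if (new_bits)
		fprintf(stderr, "New overflows for inactive PMCs: %llx\n", new_bits);
}

int main(void)
{
	warn_once_per_bit(0x5); /* warns about bits 0 and 2 */
	warn_once_per_bit(0x5); /* silent: nothing new */
	warn_once_per_bit(0xd); /* warns only about bit 3 */
	return 0;
}

The atomic fetch-or keeps the scheme safe when several CPUs take the interrupt concurrently: whichever CPU merges a bit first sees it in new_bits, everyone else sees it in prev_bits and stays silent.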
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lkml.kernel.org/r/20241001141020.2620361-1-leitao@debian.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/events/amd/core.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
index 3ac069a4559b0..1282f1a702139 100644
--- a/arch/x86/events/amd/core.c
+++ b/arch/x86/events/amd/core.c
@@ -881,11 +881,12 @@ static int amd_pmu_handle_irq(struct pt_regs *regs)
 static int amd_pmu_v2_handle_irq(struct pt_regs *regs)
 {
 	struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+	static atomic64_t status_warned = ATOMIC64_INIT(0);
+	u64 reserved, status, mask, new_bits, prev_bits;
 	struct perf_sample_data data;
 	struct hw_perf_event *hwc;
 	struct perf_event *event;
 	int handled = 0, idx;
-	u64 reserved, status, mask;
 	bool pmu_enabled;
 
 	/*
@@ -952,7 +953,12 @@ static int amd_pmu_v2_handle_irq(struct pt_regs *regs)
 	 * the corresponding PMCs are expected to be inactive according to the
 	 * active_mask
 	 */
-	WARN_ON(status > 0);
+	if (status > 0) {
+		prev_bits = atomic64_fetch_or(status, &status_warned);
+		// A new bit was set for the very first time.
+		new_bits = status & ~prev_bits;
+		WARN(new_bits, "New overflows for inactive PMCs: %llx\n", new_bits);
+	}
 
 	/* Clear overflow and freeze bits */
 	amd_pmu_ack_global_status(~status);
From: Thomas Gleixner <tglx@linutronix.de>
[ Upstream commit c163e40af9b2331b2c629fd4ec8b703ed4d4ae39 ]
clocksource_delta() has two variants: one with a check for negative motion, which is selected only by x86. This is a historic leftover, as this function was previously used in the time getter hot paths.

Since commit 135225a363ae, timekeeping_cycles_to_ns() has had unconditional protection against this as a by-product of the protection against 64-bit math overflow.
clocksource_delta() is only used in the clocksource watchdog and in timekeeping_advance(). The extra conditional there is not hurting anyone.
Remove the config option and unconditionally prevent negative motion of the readout.
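To make the retained behaviour concrete (an illustration, not part of the patch), here is a standalone userspace model of the clocksource_delta() variant the patch keeps for all architectures, using stdint types in place of the kernel's u64:

#include <stdio.h>
#include <stdint.h>

static uint64_t clocksource_delta(uint64_t now, uint64_t last, uint64_t mask)
{
	uint64_t ret = (now - last) & mask;

	/*
	 * Prevent time going backwards by checking the MSB of mask in
	 * the result. If set, return 0.
	 */
	return ret & ~(mask >> 1) ? 0 : ret;
}

int main(void)
{
	uint64_t mask = 0xffffffffULL;	/* e.g. a 32-bit free-running counter */

	/* forward motion: plain masked delta */
	printf("%llu\n", (unsigned long long)clocksource_delta(1100, 1000, mask));
	/* negative motion: the readout went backwards, clamped to 0 */
	printf("%llu\n", (unsigned long long)clocksource_delta(1000, 1100, mask));
	return 0;
}

If the masked delta lands in the top half of the mask range, the readout is treated as negative motion and clamped to 0, which is the unconditional protection the commit message refers to.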
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: John Stultz <jstultz@google.com>
Link: https://lore.kernel.org/all/20241031120328.599430157@linutronix.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 arch/x86/Kconfig                   | 1 -
 kernel/time/Kconfig                | 5 -----
 kernel/time/timekeeping_internal.h | 7 -------
 3 files changed, 13 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 49cea5b81649d..c75e18346c89b 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -131,7 +131,6 @@ config X86
 	select ARCH_HAS_PARANOID_L1D_FLUSH
 	select BUILDTIME_TABLE_SORT
 	select CLKEVT_I8253
-	select CLOCKSOURCE_VALIDATE_LAST_CYCLE
 	select CLOCKSOURCE_WATCHDOG
 	# Word-size accesses may read uninitialized data past the trailing \0
 	# in strings and cause false KMSAN reports.
diff --git a/kernel/time/Kconfig b/kernel/time/Kconfig
index a41753be1a2bf..6ce5b66643f0b 100644
--- a/kernel/time/Kconfig
+++ b/kernel/time/Kconfig
@@ -17,11 +17,6 @@ config ARCH_CLOCKSOURCE_DATA
 config ARCH_CLOCKSOURCE_INIT
 	bool
 
-# Clocksources require validation of the clocksource against the last
-# cycle update - x86/TSC misfeature
-config CLOCKSOURCE_VALIDATE_LAST_CYCLE
-	bool
-
 # Timekeeping vsyscall support
 config GENERIC_TIME_VSYSCALL
 	bool
diff --git a/kernel/time/timekeeping_internal.h b/kernel/time/timekeeping_internal.h
index 4ca2787d1642e..1d4854d5c386e 100644
--- a/kernel/time/timekeeping_internal.h
+++ b/kernel/time/timekeeping_internal.h
@@ -15,7 +15,6 @@ extern void tk_debug_account_sleep_time(const struct timespec64 *t);
 #define tk_debug_account_sleep_time(x)
 #endif
 
-#ifdef CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE
 static inline u64 clocksource_delta(u64 now, u64 last, u64 mask)
 {
 	u64 ret = (now - last) & mask;
@@ -26,12 +25,6 @@ static inline u64 clocksource_delta(u64 now, u64 last, u64 mask)
 	 */
 	return ret & ~(mask >> 1) ? 0 : ret;
 }
-#else
-static inline u64 clocksource_delta(u64 now, u64 last, u64 mask)
-{
-	return (now - last) & mask;
-}
-#endif
 
 /* Semi public for serialization of non timekeeper VDSO updates. */
 extern raw_spinlock_t timekeeper_lock;