On Mon, Nov 11 2024 at 17:23, Peter Zijlstra wrote:
On Fri, Nov 08, 2024 at 08:49:31AM -0500, Len Brown wrote:
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 766f092dab80..910cb2d72c13 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1377,6 +1377,9 @@ void smp_kick_mwait_play_dead(void)
 		for (i = 0; READ_ONCE(md->status) != newstate && i < 1000; i++) {
 			/* Bring it out of mwait */
 			WRITE_ONCE(md->control, newstate);
+			/* If MONITOR unreliable, send IPI */
+			if (boot_cpu_has_bug(X86_BUG_MONITOR))
+				__apic_send_IPI(cpu, RESCHEDULE_VECTOR);
 			udelay(5);
 		}
Going over that code again, mwait_play_dead() is doing __mwait(.ecx=0) with IRQs disabled.
And the APIC is shut down. So it won't react on the IPI either.
So that IPI you're trying to send there won't do no nothing :-/
Now that comment there says MCE/NMI/SMI are still open (non-maskable etc.) so perhaps prod it on the NMI vector?
This does seem to suggest the above code path wasn't actually tested.
I'm not sure whether that's just a suggestion :)
Perhaps mark your local machine with BUG_MONITOR, remove the md->control WRITE_ONCE() and try kexec to test it?
Thomas, any other thoughts?
NMI should work. See exc_nmi():
	if (arch_cpu_is_offline(smp_processor_id())) {
		if (microcode_nmi_handler_enabled())
			microcode_offline_nmi_handler();
		return;
	}
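To make the reasoning in this thread concrete, here is a rough user-space model of the kick loop. It is only a sketch, not kernel code: struct parked_cpu, enum wake_event, event_wakes_cpu() and kick_parked_cpu() are hypothetical names invented for illustration, and the wake rules simply encode the assumptions stated above (MONITOR unreliable, IRQs disabled with ECX=0, APIC shut down, NMI still open).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the mwait_play_dead() handshake; not kernel code. */
enum wake_event { WAKE_MONITOR_WRITE, WAKE_IPI, WAKE_NMI };

struct parked_cpu {
	unsigned int control;   /* written by the kicking CPU */
	unsigned int status;    /* written back by the parked CPU */
	bool monitor_broken;    /* models X86_BUG_MONITOR */
	bool irqs_disabled;     /* parked CPU runs MWAIT with IRQs off */
	bool apic_shut_down;    /* local APIC no longer accepts IPIs */
};

/* Does this event get the parked CPU out of MWAIT, under the model's assumptions? */
static bool event_wakes_cpu(const struct parked_cpu *c, enum wake_event ev)
{
	switch (ev) {
	case WAKE_MONITOR_WRITE:
		return !c->monitor_broken;
	case WAKE_IPI:
		/* A maskable interrupt needs a live APIC and unmasked IRQs. */
		return !c->apic_shut_down && !c->irqs_disabled;
	case WAKE_NMI:
		/* Non-maskable: assumed to always break out of MWAIT. */
		return true;
	}
	return false;
}

/* Model of the kick loop: write control, optionally prod, poll status. */
static bool kick_parked_cpu(struct parked_cpu *c, unsigned int newstate,
			    enum wake_event fallback)
{
	assert(c != NULL);

	for (int i = 0; c->status != newstate && i < 1000; i++) {
		c->control = newstate;

		bool woke = event_wakes_cpu(c, WAKE_MONITOR_WRITE);
		if (!woke && c->monitor_broken)
			woke = event_wakes_cpu(c, fallback);

		/* Once awake, the parked CPU sees control and acknowledges it. */
		if (woke)
			c->status = c->control;
	}
	return c->status == newstate;
}
```

In this model, with monitor_broken, irqs_disabled and apic_shut_down all set, the WAKE_IPI fallback runs out all 1000 iterations without status ever changing, while WAKE_NMI succeeds on the first pass. That matches the argument above, but it says nothing about how real hardware behaves.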
Thanks,
tglx