Ashok,
On Thu, Aug 20 2020 at 17:42, Ashok Raj wrote:
When offlining CPUs, fixup_irqs() migrates all interrupts away from the outgoing CPU to an online CPU. It's always possible the device sent an interrupt to the previous CPU destination. Pending interrupt bit in IRR in LAPIC identifies such interrupts. apic_soft_disable() will not capture any new interrupts in IRR. This causes interrupts from device to be lost during CPU offline. The issue was found when explicitly setting MSI affinity to a CPU and immediately offlining it. It was simple to recreate with a USB ethernet device and doing I/O to it while the CPU is offlined. Lost interrupts happen even when Interrupt Remapping is enabled.
New lines exist for a reason. They help to structure information. For the content, please see below.
Current code does apic_soft_disable() before migrating interrupts.
native_cpu_disable() { ... apic_soft_disable(); cpu_disable_common(); --> fixup_irqs(); // Too late to capture anything in IRR. }
Just flipping the above call sequence seems to hit the IRR checks and the lost interrupt is fixed for both legacy MSI and when interrupt remapping is enabled.
Seems to hit? Come on, we really want changelogs which are based on facts and not on assumptions.
Aside of that, yes that's a really subtle one and thanks for tracking it down! For some reason I never looked at that ordering, but now that you stick it in front of me, it's pretty clear that this is the root cause.
/* * Disable the local APIC. Otherwise IPI broadcasts will reach * it. It still responds normally to INIT, NMI, SMI, and SIPI
* messages.
* messages. It's important to do apic_soft_disable() after
* fixup_irqs(), because fixup_irqs() called from cpu_disable_common()
* depends on IRR being set.
That sentence does not make sense to me.
.... After apic_soft_disable() CPU preserves
* currently set IRR/ISR but new interrupts will not set IRR.
I agree with the IRR part, but ISR is simply impossible to be set in this situation.
* This causes interrupts sent to outgoing CPU before completion
* of IRQ migration to be lost. Check SDM Vol 3 "10.4.7.2 Local
* APIC State after It Has been Software Disabled" section for more
* details.
Please do not use the SDM chapter number of today. It's going to be a different one with the next version.
Something like this perhaps?
/* * Disable the local APIC. Otherwise IPI broadcasts will reach * it. It still responds normally to INIT, NMI, SMI, and SIPI * messages. * * Disabling the APIC must happen after cpu_disable_common() * which invokes fixup_irqs(). * * Disabling the APIC preserves already set bits in IRR, but * an interrupt arriving after disabling the local APIC does not * set the corresponding IRR bit. * * fixup_irqs() scans IRR for set bits so it can raise a not * yet handled interrupt on the new destination CPU via an IPI * but obviously it can't do so for IRR bits which are not set. * IOW, interrupts arriving after disabling the local APIC will * be lost. */
Hmm?
The changelog wants to have a corresponding update.
Thanks,
tglx