On Mon, Aug 17, 2020 at 11:33 AM Raj, Ashok ashok.raj@intel.com wrote:
Hi Evan
Some details below,
On Mon, Aug 17, 2020 at 11:12:17AM -0700, Evan Green wrote:
Hi Ashok, Thank you, Srikanth, and Sukumar for some very impressive debugging here.
On Fri, Aug 14, 2020 at 2:38 PM Ashok Raj ashok.raj@intel.com wrote:
When offlining CPU's, fixup_irqs() migrates all interrupts away from the outgoing CPU to an online CPU. Its always possible the device sent an interrupt to the previous CPU destination. Pending interrupt bit in IRR in lapic identifies such interrupts. apic_soft_disable() will not capture any new interrupts in IRR. This causes interrupts from device to be lost during cpu offline. The issue was found when explicitly setting MSI affinity to a CPU and immediately offlining it. It was simple to recreate with a USB ethernet device and doing I/O to it while the CPU is offlined. Lost interrupts happen even when Interrupt Remapping is enabled.
Current code does apic_soft_disable() before migrating interrupts.
native_cpu_disable() { ... apic_soft_disable(); cpu_disable_common(); --> fixup_irqs(); // Too late to capture anything in IRR. }
Just fliping the above call sequence seems to hit the IRR checks and the lost interrupt is fixed for both legacy MSI and when interrupt remapping is enabled.
Fixes: 60dcaad5736f ("x86/hotplug: Silence APIC and NMI when CPU is dead") Link: https://lore.kernel.org/lkml/875zdarr4h.fsf@nanos.tec.linutronix.de/ Signed-off-by: Ashok Raj ashok.raj@intel.com
To: linux-kernel@vger.kernel.org To: Thomas Gleixner tglx@linutronix.de Cc: Sukumar Ghorai sukumar.ghorai@intel.com Cc: Srikanth Nandamuri srikanth.nandamuri@intel.com Cc: Evan Green evgreen@chromium.org Cc: Mathias Nyman mathias.nyman@linux.intel.com Cc: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org
arch/x86/kernel/smpboot.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index ffbd9a3d78d8..278cc9f92f2f 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -1603,13 +1603,20 @@ int native_cpu_disable(void) if (ret) return ret;
cpu_disable_common(); /* * Disable the local APIC. Otherwise IPI broadcasts will reach * it. It still responds normally to INIT, NMI, SMI, and SIPI
* messages.
I'm slightly unclear about whether interrupts are disabled at the core by this point or not. I followed native_cpu_disable() up to __cpu_disable(), up to take_cpu_down(). This is passed into a call to stop_machine_cpuslocked(), where interrupts get disabled at the core. So unless there's another path, it seems like interrupts are always disabled at the core by this point.
local_irq_disable() just does cli() which allows interrupts to trickle in to the IRR bits, and once you do sti() things would flow back for normal interrupt processing.
If interrupts are always disabled, then the comment above is a little
Disable interrupts is different from disabling LAPIC. Once you do the apic_soft_disable(), there is nothing flowing into the LAPIC except for INIT, NMI, SMI and SIPI messages.
This turns off the pipe for all other interrupts to enter LAPIC. Which is different from doing a cli().
I understand the distinction. I was mostly musing on the difference in behavior your change causes if this function is entered with interrupts enabled at the core (ie sti()). But I think it never is, so that thought is moot.
I could never repro the issue reliably on comet lake after Thomas' original fix. But I can still repro it easily on jasper lake. And this patch fixes the issue for me on that platform. Thanks for the fix.
Reviewed-by: Evan Green evgreen@chromium.org Tested-by: Evan Green evgreen@chromium.org