When offlining CPUs, fixup_irqs() migrates all interrupts away from the
outgoing CPU to an online CPU. It is always possible that the device sent
an interrupt to the previous CPU destination. The pending interrupt bit in
the IRR of the local APIC identifies such interrupts. After
apic_soft_disable(), the local APIC does not capture any new interrupts in
the IRR, so interrupts from the device are lost during CPU offline. The
issue was found when explicitly setting MSI affinity to a CPU and
immediately offlining it. It was simple to reproduce with a USB ethernet
device by doing I/O to it while the CPU is offlined. Lost interrupts happen
even when Interrupt Remapping is enabled.
Current code does apic_soft_disable() before migrating interrupts.
native_cpu_disable()
{
	...
	apic_soft_disable();
	cpu_disable_common();
	  --> fixup_irqs(); // Too late to capture anything in IRR.
}
Just flipping the above call sequence seems to hit the IRR checks in
fixup_irqs(), and the lost interrupt is fixed both for legacy MSI and
when interrupt remapping is enabled.
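
The ordering problem can be illustrated with a small user-space toy model.
This is only a sketch, not kernel code: struct toy_lapic and all toy_*
helpers are hypothetical names invented for this example, and the model
assumes just the two properties described above (a soft-disabled APIC keeps
the IRR bits it already has but does not latch new ones).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of one local APIC: a 256-bit IRR plus a software-enable flag. */
struct toy_lapic {
	uint64_t irr[4];
	bool enabled;
};

/* A device raises 'vector' at this APIC; only an enabled APIC latches it. */
static void toy_deliver(struct toy_lapic *apic, unsigned int vector)
{
	assert(vector < 256);
	if (apic->enabled)
		apic->irr[vector / 64] |= UINT64_C(1) << (vector % 64);
}

/* Soft disable: bits already in IRR stay set, new ones are not accepted. */
static void toy_soft_disable(struct toy_lapic *apic)
{
	apic->enabled = false;
}

/* Stand-in for the IRR check done during migration: is 'vector' pending? */
static bool toy_irr_pending(const struct toy_lapic *apic, unsigned int vector)
{
	assert(vector < 256);
	return (apic->irr[vector / 64] & (UINT64_C(1) << (vector % 64))) != 0;
}

/*
 * Model one offline sequence in which the device fires at the outgoing CPU
 * just before the migration step checks IRR. Returns true if the migration
 * step sees the pending interrupt.
 */
static bool toy_offline_sees_irq(bool disable_first, unsigned int vector)
{
	struct toy_lapic apic = { .enabled = true };
	bool seen;

	if (disable_first)
		toy_soft_disable(&apic);	/* old order: disable, then migrate */

	toy_deliver(&apic, vector);		/* interrupt sent to the old CPU */
	seen = toy_irr_pending(&apic, vector);	/* migration-time IRR check */

	if (!disable_first)
		toy_soft_disable(&apic);	/* new order: migrate, then disable */

	return seen;
}
```

In this model toy_offline_sees_irq(true, v) returns false (the interrupt is
lost), while toy_offline_sees_irq(false, v) returns true, which mirrors the
effect of the reordering in the patch.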
Fixes: 60dcaad5736f ("x86/hotplug: Silence APIC and NMI when CPU is dead")
Link: https://lore.kernel.org/lkml/875zdarr4h.fsf@nanos.tec.linutronix.de/
Signed-off-by: Ashok Raj <ashok.raj(a)intel.com>
To: linux-kernel(a)vger.kernel.org
To: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Sukumar Ghorai <sukumar.ghorai(a)intel.com>
Cc: Srikanth Nandamuri <srikanth.nandamuri(a)intel.com>
Cc: Evan Green <evgreen(a)chromium.org>
Cc: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas(a)google.com>
Cc: stable(a)vger.kernel.org
---
arch/x86/kernel/smpboot.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index ffbd9a3d78d8..278cc9f92f2f 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1603,13 +1603,20 @@ int native_cpu_disable(void)
if (ret)
return ret;
+ cpu_disable_common();
/*
* Disable the local APIC. Otherwise IPI broadcasts will reach
* it. It still responds normally to INIT, NMI, SMI, and SIPI
- * messages.
+ * messages. It's important to do apic_soft_disable() after
+ * fixup_irqs(), because fixup_irqs() called from cpu_disable_common()
+ * depends on IRR being set. After apic_soft_disable() CPU preserves
+ * currently set IRR/ISR but new interrupts will not set IRR.
+ * This causes interrupts sent to outgoing cpu before completion
+ * of irq migration to be lost. Check SDM Vol 3 "10.4.7.2 Local
+ * APIC State after It Has been Software Disabled" section for more
+ * details.
*/
apic_soft_disable();
- cpu_disable_common();
return 0;
}
--
2.13.6
As per PAPR, we have to look at both the EPOW sensor value and the event
modifier to identify the type of event and take appropriate action.
Sensor value = 3 (EPOW_SYSTEM_SHUTDOWN): schedule the system to be shut down
after an OS-defined delay (default 10 mins).
EPOW Event Modifier for sensor value = 3:
We have to initiate immediate shutdown for most of the event modifiers,
except value = 2 (system running on UPS).
Checking the firmware document, it is clear that we have to wait for a
predefined time before initiating shutdown. If power is restored within that
time, we should cancel the shutdown process. I think commit 79872e35
accidentally enabled immediate poweroff for the EPOW_SHUTDOWN_ON_UPS event.
We have a user space tool (rtas_errd) on the LPAR to monitor for
EPOW_SHUTDOWN_ON_UPS. Once it gets the event, it initiates shutdown after a
predefined time and also starts monitoring for any new EPOW events. If it
receives a "Power restored" event before the predefined time elapses, it
cancels the shutdown. Otherwise, it shuts down the system after the
predefined time.
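
The delayed-shutdown behaviour described above can be sketched as a small
state machine. This is only an illustration and is not taken from rtas_errd:
every toy_* name, the event codes and the tick-based timer are assumptions
made for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical event codes for this sketch; not rtas_errd or kernel values. */
enum toy_event { TOY_EV_ON_UPS, TOY_EV_POWER_RESTORED, TOY_EV_TICK };

enum toy_state { TOY_IDLE, TOY_SHUTDOWN_PENDING, TOY_SHUTDOWN_NOW };

struct toy_monitor {
	enum toy_state state;
	unsigned int delay;	/* predefined delay, in timer ticks */
	unsigned int remaining;	/* ticks left before shutdown */
};

static void toy_monitor_init(struct toy_monitor *m, unsigned int delay_ticks)
{
	assert(m);
	m->state = TOY_IDLE;
	m->delay = delay_ticks;
	m->remaining = 0;
}

/* Feed one event into the monitor and return the resulting state. */
static enum toy_state toy_monitor_step(struct toy_monitor *m, enum toy_event ev)
{
	assert(m);
	switch (ev) {
	case TOY_EV_ON_UPS:
		/* Running on UPS: arm the timer instead of powering off. */
		if (m->state == TOY_IDLE) {
			m->state = TOY_SHUTDOWN_PENDING;
			m->remaining = m->delay;
		}
		break;
	case TOY_EV_POWER_RESTORED:
		/* Power came back in time: cancel the pending shutdown. */
		if (m->state == TOY_SHUTDOWN_PENDING)
			m->state = TOY_IDLE;
		break;
	case TOY_EV_TICK:
		/* Timer expiry without "power restored": shut down now. */
		if (m->state == TOY_SHUTDOWN_PENDING) {
			if (m->remaining > 0)
				m->remaining--;
			if (m->remaining == 0)
				m->state = TOY_SHUTDOWN_NOW;
		}
		break;
	}
	return m->state;
}
```

An immediate orderly_poweroff() in the kernel for EPOW_SHUTDOWN_ON_UPS would
skip the pending state entirely, which is why the patch leaves that decision
to user space.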
Fixes: 79872e35 ("powerpc/pseries: All events of EPOW_SYSTEM_SHUTDOWN must initiate shutdown")
Cc: stable(a)vger.kernel.org # v4.0+
Cc: Tyrel Datwyler <tyreld(a)linux.ibm.com>
Cc: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Vasant Hegde <hegdevasant(a)linux.vnet.ibm.com>
---
Changes in v2:
- Updated patch description based on comments from mpe and Tyrel.
-Vasant
arch/powerpc/platforms/pseries/ras.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index f3736fcd98fc..13c86a292c6d 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -184,7 +184,6 @@ static void handle_system_shutdown(char event_modifier)
case EPOW_SHUTDOWN_ON_UPS:
pr_emerg("Loss of system power detected. System is running on"
" UPS/battery. Check RTAS error log for details\n");
- orderly_poweroff(true);
break;
case EPOW_SHUTDOWN_LOSS_OF_CRITICAL_FUNCTIONS:
--
2.26.2