Vasant Hegde <hegdevasant@linux.vnet.ibm.com> writes:
As per PAPR we have to look at both the EPOW sensor value and the event modifier to identify the type of event and take appropriate action.
Sensor value = 3 (EPOW_SYSTEM_SHUTDOWN): schedule the system to be shut down after an OS defined delay (default 10 mins).
EPOW Event Modifier for sensor value = 3: we have to initiate an immediate shutdown for most of the event modifiers, except value = 2 (system running on UPS).
Checking with the firmware document, it's clear that we have to wait for a predefined time before initiating shutdown. If power is restored within that time we should cancel the shutdown process. I think commit 79872e35 accidentally enabled immediate poweroff for the EPOW_SHUTDOWN_ON_UPS event.
It's not that clear to me :)
LoPAPR v1.1 section 10.2.2 includes table 136 "EPOW Action Codes":
SYSTEM_SHUTDOWN 3
The system must be shut down. An EPOW-aware OS logs the EPOW error log information, then schedules the system to be shut down to begin after an OS defined delay internal (default is 10 minutes.)
And then in section 10.3.2.2.8 there is table 146 "Platform Event Log Format, Version 6, EPOW Section", which includes the "EPOW Event Modifier":
For EPOW sensor value = 3
  0x01 = Normal system shutdown with no additional delay
  0x02 = Loss of utility power, system is running on UPS/Battery
  0x03 = Loss of system critical functions, system should be shutdown
  0x04 = Ambient temperature too high
  All other values = reserved
There is also section 7.3.6.4 which includes a note saying:
2. The report that a system needs to be shutdown due to running under a UPS would be given by the platform as an EPOW event with EPOW event modifier being given as, 0x02 = Loss of utility power, system is running on UPS/Battery, as described in section Section 10.3.2.2.8, “Platform Event Log Format, EPOW Section,” on page 308.
So the only mention of the 10 minutes is in relation to all SYSTEM_SHUTDOWN events, i.e. according to that we should not be doing an immediate shutdown for any of the events.
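To make the distinction concrete, here is a rough C sketch of the modifier values from the table quoted above, together with the policy the patch is aiming for (immediate poweroff for everything except "running on UPS"). The enum and function names are made up for illustration and are not the identifiers used in ras.c, and treating reserved values as "no immediate shutdown" is an assumption of this sketch, not something PAPR states.

```c
#include <stdbool.h>

/*
 * Illustrative names only. The numeric values follow the
 * "EPOW Event Modifier" table for EPOW sensor value = 3.
 */
enum epow_shutdown_modifier {
	MOD_NORMAL_SHUTDOWN       = 0x01, /* normal shutdown, no additional delay */
	MOD_RUNNING_ON_UPS        = 0x02, /* loss of utility power, on UPS/battery */
	MOD_LOSS_OF_CRITICAL_FUNC = 0x03, /* loss of system critical functions */
	MOD_AMBIENT_TEMP_TOO_HIGH = 0x04, /* ambient temperature too high */
};

/*
 * Hypothetical policy helper: returns true if the kernel should power
 * off straight away, false if the decision is left to user space.
 */
static bool shutdown_immediately(unsigned char modifier)
{
	switch (modifier) {
	case MOD_NORMAL_SHUTDOWN:
	case MOD_LOSS_OF_CRITICAL_FUNC:
	case MOD_AMBIENT_TEMP_TOO_HIGH:
		return true;
	case MOD_RUNNING_ON_UPS:
		return false;	/* user space may delay and later cancel */
	default:
		return false;	/* reserved value: assumption, do nothing */
	}
}
```

With this split, only modifier 0x02 leaves room for a delayed and cancellable shutdown.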
We have a user space tool (rtas_errd) on the LPAR that monitors for EPOW_SHUTDOWN_ON_UPS. Once it gets the event it initiates a shutdown after a predefined time. It also starts monitoring for any new EPOW events. If it receives a "Power restored" event before the predefined time it will cancel the shutdown. Otherwise, after the predefined time it will shut down the system.
What event are you referring to as the "Power restored" event? AFAICS PAPR just says we "may" receive an EPOW_RESET.
I can't see anything else about what we're supposed to do if power is restored.
Anyway I'm not opposed to the change, but I don't think it's correct to say that PAPR defines the behaviour.
Rather we used to implement a certain behaviour, and we have at least one customer who relies on that old behaviour and dislikes the new behaviour. It's also generally good to defer decisions like this to userspace, so that administrators can customise the behaviour.
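As a sketch of the kind of user space policy being described (arm a delayed shutdown when the UPS event arrives, cancel it if power comes back in time), something like the following small state machine would do. This is not rtas_errd's code: the event names, the struct and the helper are all hypothetical, the "power restored" event stands in for whatever the platform actually reports (possibly EPOW_RESET), and the 10 minute default is taken from the PAPR text quoted earlier.

```c
#include <stdbool.h>

/* Hypothetical events seen by a user space monitor. */
enum ups_event {
	EV_ON_UPS,		/* EPOW shutdown event, modifier 0x02 */
	EV_POWER_RESTORED,	/* assumed "power is back" notification */
	EV_TICK,		/* periodic timer tick */
};

#define DEFAULT_DELAY_S (10u * 60u)	/* 10 minutes, per the quoted default */

struct shutdown_timer {
	bool armed;
	unsigned int remaining_s;
};

/*
 * Feed one event into the timer. elapsed_s is only used for EV_TICK.
 * Returns true when the delay has expired and shutdown should start.
 */
static bool timer_step(struct shutdown_timer *t, enum ups_event ev,
		       unsigned int elapsed_s)
{
	switch (ev) {
	case EV_ON_UPS:
		if (!t->armed) {
			t->armed = true;
			t->remaining_s = DEFAULT_DELAY_S;
		}
		return false;
	case EV_POWER_RESTORED:
		t->armed = false;	/* cancel the pending shutdown */
		t->remaining_s = 0;
		return false;
	case EV_TICK:
		if (!t->armed)
			return false;
		if (elapsed_s >= t->remaining_s) {
			t->armed = false;
			t->remaining_s = 0;
			return true;	/* delay expired: shut down now */
		}
		t->remaining_s -= elapsed_s;
		return false;
	}
	return false;
}
```

Keeping the delay and the cancel logic outside the kernel is what lets an administrator change DEFAULT_DELAY_S or the policy itself without a kernel change.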
Anyway I'll massage the change log a bit to incorporate some of the above and apply it.
cheers
Fixes: 79872e35 (powerpc/pseries: All events of EPOW_SYSTEM_SHUTDOWN must initiate shutdown)
Cc: stable@vger.kernel.org # v4.0+
Cc: Tyrel Datwyler <tyreld@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
Changes in v2:
- Updated patch description based on comments from mpe and Tyrel.
-Vasant

 arch/powerpc/platforms/pseries/ras.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index f3736fcd98fc..13c86a292c6d 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -184,7 +184,6 @@ static void handle_system_shutdown(char event_modifier)
 	case EPOW_SHUTDOWN_ON_UPS:
 		pr_emerg("Loss of system power detected. System is running on"
 			 " UPS/battery. Check RTAS error log for details\n");
-		orderly_poweroff(true);
 		break;
 
 	case EPOW_SHUTDOWN_LOSS_OF_CRITICAL_FUNCTIONS:
-- 
2.26.2