On Wed 03 Jun 15:33 PDT 2020, Evan Green wrote:
On Tue, Jun 2, 2020 at 10:29 PM Sibi Sankar sibis@codeaurora.org wrote:
Evan, Thanks for taking time to review the series.
On 2020-06-02 23:14, Evan Green wrote:
On Tue, Jun 2, 2020 at 9:33 AM Sibi Sankar sibis@codeaurora.org wrote:
Sometimes the stop triggers a watchdog rather than a stop-ack. Update the running state to false on requesting stop to skip the watchdog instead.
Error Logs: $ echo stop > /sys/class/remoteproc/remoteproc0/state ipa 1e40000.ipa: received modem stopping event remoteproc-modem: watchdog received: sys_m_smsm_mpss.c:291:APPS force stop qcom-q6v5-mss 4080000.remoteproc-modem: port failed halt ipa 1e40000.ipa: received modem offline event remoteproc0: stopped remote processor 4080000.remoteproc-modem
Fixes: 3b415c8fb263 ("remoteproc: q6v5: Extract common resource handling") Cc: stable@vger.kernel.org Signed-off-by: Sibi Sankar sibis@codeaurora.org
Are you sure you want to tolerate this behavior from MSS? This is a graceful shutdown, modem shouldn't have a problem completing the proper handshake. If they do, isn't that a bug on the modem side?
The graceful shutdown is achieved though sysmon (enabled using CONFIG_QCOM_SYSMON). When sysmon is enabled we get a shutdown-ack when we try to stop the modem, post which request stop is a basically a nop. Request stop is done to force stop the modem during failure cases (like rmtfs is not running and so on) and we do want to mask the wdog that we get during this scenario ( The locking already prevents the servicing of the wdog during shutdown, the check just prevents the scheduling of crash handler and err messages associated with it). Also this check was always present and was missed during common q6v5 resource helper migration, hence the unused running state in mss driver.
So you're saying that the intention of the ->running check already in q6v5_wdog_interrupt() was to allow either the stop-ack or wdog interrupt to complete the stop. This patch just fixes a regression introduced during the refactor. This patch seems ok to me then. It still sort of seems like a bug that the modem responds arbitrarily in one of two ways, even to a "harsh" shutdown request.
I think the patch properly fixes this regression, but I share your concern about the fact that we omit an entire category of shutdown-related crashes from being reported.
The problem I presume with the current behavior is that the shutdown will race against the crash handler - which might boot the remoteproc up again while the shutdown is progressing.
So I'm okay with this fix for the immediate problem, but think it would be nice if we could report the issue appropriately and then finalize the shutdown.
I wasn't aware of QCOM_SYSMON. Reading it now, It seems like kind of a lot... do I really need all this? Can I get by with just remoteproc stops?
It used to be that we set one bit in shared memory and sent an interrupt and the modem would set another bit and interrupt us back when it was done shutting down...
Anyway, for this patch:
Reviewed-by: Evan Green evgreen@chromium.org
Thanks, Bjorn