Hi Maxime,
On 2/11/20 2:26 AM, Maxime Ripard wrote:
On Tue, Feb 11, 2020 at 01:28:58AM -0600, Samuel Holland wrote:
The driver currently uses runtime PM to perform some of the module initialization and cleanup. This has three problems:
- There is no Kconfig dependency on CONFIG_PM, so if runtime PM is disabled, the driver will not work at all, since the module will never be initialized.
That's fairly easy to fix.
- The driver does not ensure that the device is suspended when sun6i_dsi_probe() fails or when sun6i_dsi_remove() is called. It simply disables runtime PM. From the docs of pm_runtime_disable():
The device can be either active or suspended after its runtime PM has been disabled.
And indeed, the device will likely still be active if sun6i_dsi_probe() fails. For example, if the panel driver is not yet loaded, we have the following sequence:
sun6i_dsi_probe()
   pm_runtime_enable()
   mipi_dsi_host_register()
      of_mipi_dsi_device_add(child)
         ...device_add()...
            __device_attach()
               pm_runtime_get_sync(dev->parent) -> Causes resume
               bus_for_each_drv()
                  __device_attach_driver() -> No match for panel
               pm_runtime_put(dev->parent) -> Async idle request
   component_add()
      __component_add()
         try_to_bring_up_masters()
            try_to_bring_up_master()
               sun4i_drv_bind()
                  component_bind_all()
                     component_bind()
                        sun6i_dsi_bind() -> Fails with -EPROBE_DEFER
   mipi_dsi_host_unregister()
   pm_runtime_disable()
      __pm_runtime_disable()
         __pm_runtime_barrier() -> Idle request is still pending
            cancel_work_sync() -> DSI host is *not* suspended!
Since the device is not suspended, the clock and regulator are never disabled. The imbalance causes a WARN at devres free time.
That's interesting. I guess this is shown when you have the panel as a module?
That's the easiest way to get sun6i_dsi_probe() to fail, yes. Even if the panel were built in, `modprobe sun6i_dsi; rmmod sun6i_dsi` would likely trigger the issue, since sun6i_dsi_remove() has the same problem.
There's something pretty weird though. The comment in __pm_runtime_disable() states that it will "wait for all operations in progress to complete", so at the end of the __pm_runtime_disable() call, the DSI host will be suspended and we shouldn't have a WARN at all.
No, that's not what "operations in progress" means. That only waits for a callback that is *already running* on another CPU to complete, in other words `dev->power.runtime_status == RPM_SUSPENDING`.
Here the callback does not get run at all. At the time __pm_runtime_disable() is called:
   dev->power.runtime_status == RPM_ACTIVE
   dev->power.request == RPM_REQ_IDLE
   dev->power.request_pending == true
because pm_runtime_put() calls rpm_idle() with the RPM_ASYNC flag.
And as I mentioned, that request is thrown away by __pm_runtime_barrier(). So the device PM core is working as documented.
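To make the ordering concrete, here is a minimal userspace sketch of the sequence above. It is a toy model, not kernel code: the toy_* names are made up for illustration, the field names only mirror those in dev->power, and the real PM core does far more (locking, idle callbacks, autosuspend). It only shows that an async idle request queued by a put is discarded, not executed, when runtime PM is disabled before the work item runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT kernel code. Field names mirror dev->power,
 * but the logic is heavily simplified for illustration. */
enum rpm_status { RPM_ACTIVE, RPM_SUSPENDED };
enum rpm_request { RPM_REQ_NONE, RPM_REQ_IDLE };

struct toy_dev {
    enum rpm_status runtime_status;
    enum rpm_request request;
    bool request_pending;
    int usage_count;
    int disable_depth;   /* > 0 means runtime PM is disabled */
    int suspend_calls;   /* how often the suspend callback ran */
};

/* Models pm_runtime_get_sync(): take a reference and resume. */
void toy_get_sync(struct toy_dev *d)
{
    d->usage_count++;
    d->request = RPM_REQ_NONE;   /* a pending idle request is cancelled */
    d->request_pending = false;
    d->runtime_status = RPM_ACTIVE;
}

/* Models pm_runtime_put(): drop a reference, queue an *async* idle. */
void toy_put_async(struct toy_dev *d)
{
    if (--d->usage_count == 0) {
        d->request = RPM_REQ_IDLE;
        d->request_pending = true;
    }
}

/* Models the workqueue eventually running the queued request. */
void toy_run_pending(struct toy_dev *d)
{
    if (!d->request_pending || d->disable_depth > 0)
        return;
    d->request_pending = false;
    d->request = RPM_REQ_NONE;
    if (d->usage_count == 0 && d->runtime_status == RPM_ACTIVE) {
        d->suspend_calls++;
        d->runtime_status = RPM_SUSPENDED;
    }
}

/* Models pm_runtime_disable(): the pending request is dropped, not run. */
void toy_disable(struct toy_dev *d)
{
    d->disable_depth++;
    d->request = RPM_REQ_NONE;
    d->request_pending = false;
}

/* Replays the failing-probe sequence and returns the final status. */
enum rpm_status toy_probe_defer(struct toy_dev *d)
{
    toy_get_sync(d);     /* __device_attach(): resume the parent */
    toy_put_async(d);    /* no panel match: async idle request queued */
    toy_disable(d);      /* probe error path runs before the work item */
    toy_run_pending(d);  /* too late: nothing is pending any more */
    return d->runtime_status;
}
```

Starting from a suspended device, toy_probe_defer() ends with the status still RPM_ACTIVE and the suspend callback never run. If the work item gets to run before the disable, the same model does suspend the device.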
- The driver relies on being suspended when sun6i_dsi_encoder_enable() is called. The resume callback has a comment that says:
Some part of it can only be done once we get a number of lanes, see sun6i_dsi_inst_init
And then part of the resume callback only runs if dsi->device is not NULL (that is, if sun6i_dsi_attach() has been called). However, as the above call graph shows, the resume callback is guaranteed to be called before sun6i_dsi_attach(); it is called before child devices get their drivers attached.
Isn't it something that has been changed by your previous patch though?
No. Before the previous patch, sun6i_dsi_bind() required sun6i_dsi_attach() to have been called first. So either the panel driver is not loaded, and issue #2 happens, or the panel driver is loaded, and you get the following modification to the above call graph:
mipi_dsi_host_register()
   ...
      __device_attach()
         pm_runtime_get_sync(dev->parent) -> Causes resume
         bus_for_each_drv()
            __device_attach_driver()
               [panel probe function]
                  mipi_dsi_attach()
                     sun6i_dsi_attach()
         pm_runtime_put(dev->parent) -> Async idle request
component_add()
   ...
      sun6i_dsi_bind()
   ...
      sun6i_dsi_encoder_enable()
         pm_runtime_get_sync() -> Cancels idle request
And because `dev->power.runtime_status == RPM_ACTIVE` still, the callback is *not* run. Either way you have the same problem.
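The same point as a small userspace sketch, under the same caveat: this is a toy model with made-up toy_* names, not the sun6i driver or the PM core. The `device` field stands in for dsi->device and `inst_init_calls` for calls to sun6i_dsi_inst_init(). It replays the call graph with a panel driver loaded and counts how often the attach-dependent part of the resume callback runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model, NOT the sun6i driver. */
struct toy_dsi {
    bool active;          /* runtime PM status: active or suspended */
    bool idle_pending;    /* an async idle request is queued */
    int usage_count;
    const char *device;   /* stands in for dsi->device; NULL until attach */
    int resume_calls;
    int inst_init_calls;
};

/* Stands in for the resume callback: the instruction setup is skipped
 * while no panel is attached. */
void toy_runtime_resume(struct toy_dsi *dsi)
{
    dsi->resume_calls++;
    if (dsi->device)
        dsi->inst_init_calls++;
}

void toy_get_sync(struct toy_dsi *dsi)
{
    dsi->usage_count++;
    dsi->idle_pending = false;   /* cancel any queued idle request */
    if (!dsi->active) {          /* callback runs only on a real resume */
        toy_runtime_resume(dsi);
        dsi->active = true;
    }
}

void toy_put_async(struct toy_dsi *dsi)
{
    if (--dsi->usage_count == 0)
        dsi->idle_pending = true;   /* suspend would happen later, if ever */
}

void toy_probe_with_panel(struct toy_dsi *dsi)
{
    toy_get_sync(dsi);       /* __device_attach(): resume, no panel yet */
    dsi->device = "panel";   /* panel probe -> sun6i_dsi_attach() */
    toy_put_async(dsi);      /* async idle request */
    toy_get_sync(dsi);       /* encoder enable: still active, no callback */
}
```

After toy_probe_with_panel() on a zero-initialized struct, the resume callback has run exactly once, before the attach, so the attach-dependent step has never run even though the device is active and a panel is bound.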
Therefore, part of the controller initialization will only run if the device is suspended between the calls to mipi_dsi_host_register() and component_add() (which ends up calling sun6i_dsi_encoder_enable()). Again, as shown by the above call graph, this is not the case. It appears that the controller happens to work because it is still initialized by the bootloader.
We don't have any bootloader support for MIPI-DSI, so no, that's not it.
Because the connector is hardcoded to always be connected, the device's runtime PM reference is not dropped until system suspend, when sun4i_drv_drm_sys_suspend() ends up calling sun6i_dsi_encoder_disable(). However, that is done as a system sleep PM hook, and at that point the system PM core has already taken another runtime PM reference, so sun6i_dsi_runtime_suspend() is not called. Likewise, by the time the PM core releases its reference, sun4i_drv_drm_sys_resume() has already re-enabled the encoder.
So after system suspend and resume, we have *still never called* sun6i_dsi_inst_init(), and now that the rest of the display pipeline has been reset, the DSI host is unable to communicate with the panel, causing VBLANK timeouts.
Either way, I guess just moving the pm_runtime_enable call to sun6i_dsi_attach will fix this, right? We don't really need to have the DSI controller powered up before that time anyway.
Sorry, but no again. It would solve issue #2 (only if the previous patch is applied), but not issue #3.
Regardless of when runtime PM is enabled, sun6i_dsi_runtime_suspend() will not be called until the device's usage count drops to 0. And as long as a panel is bound, the controller's usage count will be >0, *even during system suspend* while the encoder is turned off.
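As a rough illustration of the reference counting across a system suspend/resume cycle, here is a userspace sketch. The toy_* names are hypothetical and the model only tracks the counter; the ordering of gets and puts follows the description above (the PM core takes its reference before the sleep hook disables the encoder, and releases it after the resume hook has re-enabled it).

```c
#include <assert.h>

/* Toy userspace model of the usage counter only, NOT kernel code. */
struct toy_pm {
    int usage_count;
    int min_seen;        /* lowest usage count observed during the cycle */
    int suspend_calls;   /* times the runtime suspend callback would run */
};

void toy_get(struct toy_pm *pm)
{
    pm->usage_count++;
}

void toy_put(struct toy_pm *pm)
{
    pm->usage_count--;
    if (pm->usage_count < pm->min_seen)
        pm->min_seen = pm->usage_count;
    if (pm->usage_count == 0)
        pm->suspend_calls++;
}

/* Replays one system suspend/resume cycle; returns the lowest count seen. */
int toy_suspend_resume_cycle(struct toy_pm *pm)
{
    pm->min_seen = pm->usage_count;
    toy_get(pm);   /* system PM core takes its own reference */
    toy_put(pm);   /* sleep hook: encoder disable drops the driver's ref */
    toy_get(pm);   /* resume hook: encoder enable takes it again */
    toy_put(pm);   /* system PM core releases its reference */
    return pm->min_seen;
}
```

With one reference held by the enabled encoder on entry, the count never goes below 1 during the cycle, so in this model the runtime suspend callback is never reached.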
Before the previous patch, the usage count would never drop to 0 under *any* circumstance.
Fix all of these issues by inlining the runtime PM hooks into the encoder enable/disable functions, which are guaranteed to run after a panel is attached. This allows sun6i_dsi_inst_init() to be called unconditionally. Furthermore, this causes the hardware to be turned off during system suspend and reinitialized on resume, which was not happening before.
That's not something we should do really. We're really lacking any power management, so we should be having more of runtime_pm, not less.
This *is* adding more power management! The current runtime_pm hooks never actually suspend the device, as described above. And even if they did work, this would not extend the lifetime during which the device is active! I'm calling the power-up/power-down routines at exactly the same point they were previously getting called, except for two changes:
1) The device does not get powered up during mipi_dsi_host_register(), which you just said was unnecessary.
2) The code in the PM hooks actually gets run when it was intended to be run.
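For reference, the shape of the change can be sketched in userspace like this. It is not the actual patch: toy_power_on() and toy_power_off() are hypothetical stand-ins for the code that used to live in the runtime PM hooks, `power_refs` stands in for the clock and regulator enable counts, and `inst_init_calls` counts calls to what would be sun6i_dsi_inst_init().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model of the proposed structure, NOT the actual patch. */
struct toy_dsi {
    bool powered;
    int power_refs;        /* stands in for clock/regulator enable counts */
    int inst_init_calls;
};

/* Hypothetical stand-in for the former runtime resume hook body. */
void toy_power_on(struct toy_dsi *dsi)
{
    dsi->power_refs++;
    dsi->powered = true;
}

/* Hypothetical stand-in for the former runtime suspend hook body. */
void toy_power_off(struct toy_dsi *dsi)
{
    dsi->power_refs--;
    dsi->powered = false;
}

void toy_encoder_enable(struct toy_dsi *dsi)
{
    toy_power_on(dsi);
    /* A panel is always attached by now, so this is unconditional. */
    dsi->inst_init_calls++;
}

void toy_encoder_disable(struct toy_dsi *dsi)
{
    toy_power_off(dsi);
}
```

In this model every enable is paired with exactly one disable, so the enable counts stay balanced, and a disable/enable pair around system suspend reruns the initialization.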
Maxime
Samuel