Hi,
On Tue, Feb 11, 2020 at 01:28:58AM -0600, Samuel Holland wrote:
The driver currently uses runtime PM to perform some of the module initialization and cleanup. This has three problems:
- There is no Kconfig dependency on CONFIG_PM, so if runtime PM is disabled, the driver will not work at all, since the module will never be initialized.
That's fairly easy to fix.
The driver does not ensure that the device is suspended when sun6i_dsi_probe() fails or when sun6i_dsi_remove() is called. It simply disables runtime PM. From the docs of pm_runtime_disable():
The device can be either active or suspended after its runtime PM has been disabled.
And indeed, the device will likely still be active if sun6i_dsi_probe fails. For example, if the panel driver is not yet loaded, we have the following sequence:
sun6i_dsi_probe() pm_runtime_enable() mipi_dsi_host_register() of_mipi_dsi_device_add(child) ...device_add()... __device_attach() pm_runtime_get_sync(dev->parent) -> Causes resume bus_for_each_drv() __device_attach_driver() -> No match for panel pm_runtime_put(dev->parent) -> Async idle request component_add() __component_add() try_to_bring_up_masters() try_to_bring_up_master() sun4i_drv_bind() component_bind_all() component_bind() sun6i_dsi_bind() -> Fails with -EPROBE_DEFER mipi_dsi_host_unregister() pm_runtime_disable() __pm_runtime_disable() __pm_runtime_barrier() -> Idle request is still pending cancel_work_sync() -> DSI host is *not* suspended!
Since the device is not suspended, the clock and regulator are never disabled. The imbalance causes a WARN at devres free time.
That's interesting. I guess this is shown when you have the panel as a module?
There's something pretty weird though. The comment in __pm_runtime_disable states that it will "wait for all operations in progress to complete" so at the end of __pm_runtime_disable call, the DSI host will be suspended and we shouldn't have a WARN at all.
The driver relies on being suspended when sun6i_dsi_encoder_enable() is called. The resume callback has a comment that says:
Some part of it can only be done once we get a number of lanes, see sun6i_dsi_inst_init
And then part of the resume callback only runs if dsi->device is not NULL (that is, if sun6i_dsi_attach() has been called). However, as the above call graph shows, the resume callback is guaranteed to be called before sun6i_dsi_attach(); it is called before child devices get their drivers attached.
Isn't it something that has been changed by your previous patch though?
Therefore, part of the controller initialization will only run if the device is suspended between the calls to mipi_dsi_host_register() and component_add() (which ends up calling sun6i_dsi_encoder_enable()). Again, as shown by the above call graph, this is not the case. It appears that the controller happens to work because it is still initialized by the bootloader.
We don't have any bootloader support for MIPI-DSI, so no, that's not it.
Because the connector is hardcoded to always be connected, the device's runtime PM reference is not dropped until system suspend, when sun4i_drv_drm_sys_suspend() ends up calling sun6i_dsi_encoder_disable(). However, that is done as a system sleep PM hook, and at that point the system PM core has already taken another runtime PM reference, so sun6i_dsi_runtime_suspend() is not called. Likewise, by the time the PM core releases its reference, sun4i_drv_drm_sys_resume() has already re-enabled the encoder.
So after system suspend and resume, we have *still never called* sun6i_dsi_inst_init(), and now that the rest of the display pipeline has been reset, the DSI host is unable to communicate with the panel, causing VBLANK timeouts.
Either way, I guess just moving the pm_runtime_enable call to sun6i_dsi_attach will fix this, right? We don't really need to have the DSI controller powered up before that time anyway.
Fix all of these issues by inlining the runtime PM hooks into the encoder enable/disable functions, which are guaranteed to run after a panel is attached. This allows sun6i_dsi_inst_init() to be called unconditionally. Furthermore, this causes the hardware to be turned off during system suspend and reinitialized on resume, which was not happening before.
That's not something we should do really. We're really lacking any power management, so we should be having more of runtime_pm, not less.
Maxime