On 12/9/2024 1:55 PM, Dmitry Baryshkov wrote:
On Mon, 9 Dec 2024 at 21:54, Abhinav Kumar quic_abhinavk@quicinc.com wrote:
On 12/9/2024 2:04 AM, Dmitry Baryshkov wrote:
During suspend/resume process all connectors are explicitly disabled and then reenabled. However resume fails because of the connector_status check:
[dpu error]connector not connected 3 [drm:drm_mode_config_helper_resume [drm_kms_helper]] *ERROR* Failed to resume (-22)
It doesn't make sense to check for the Writeback connected status (and other drivers don't perform such check), so drop the check.
It wasn't a problem before the commit 71174f362d67 ("drm/msm/dpu: move writeback's atomic_check to dpu_writeback.c"), since encoder's atomic_check() is called under a different conditions that the connector's atomic_check() (e.g. it is not called if there is no connected CRTC or if the corresponding connector is not a part of the new state).
Fixes: 71174f362d67 ("drm/msm/dpu: move writeback's atomic_check to dpu_writeback.c") Cc: stable@vger.kernel.org Reported-by: Leonard Lausen leonard@lausen.nl Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/57 Tested-by: Leonard Lausen leonard@lausen.nl # on sc7180 lazor Reported-by: György Kurucz me@kuruczgy.com Link: https://lore.kernel.org/all/b70a4d1d-f98f-4169-942c-cb9006a42b40@kuruczgy.co... Reported-by: Johan Hovold johan+linaro@kernel.org Link: https://lore.kernel.org/all/ZzyYI8KkWK36FfXf@hovoldconsulting.com/ Tested-by: György Kurucz me@kuruczgy.com Reviewed-by: Johan Hovold johan+linaro@kernel.org Tested-by: Johan Hovold johan+linaro@kernel.org Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org
Leonard Lausen reported an issue with suspend/resume of the sc7180 devices. Fix the WB atomic check, which caused the issue.
Changes in v4:
- Epanded commit message (Johan)
- Link to v3: https://lore.kernel.org/r/20241208-dpu-fix-wb-v3-1-a1de69ce4a1b@linaro.org
Changes in v3:
- Rebased on top of msm-fixes
- Link to v2: https://lore.kernel.org/r/20240802-dpu-fix-wb-v2-0-7eac9eb8e895@linaro.org
Changes in v2:
- Reworked the writeback to just drop the connector->status check.
- Expanded commit message for the debugging patch.
- Link to v1: https://lore.kernel.org/r/20240709-dpu-fix-wb-v1-0-448348bfd4cb@linaro.org
drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c index 16f144cbc0c986ee266412223d9e605b01f9fb8c..8ff496082902b1ee713e806140f39b4730ed256a 100644 --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c @@ -42,9 +42,6 @@ static int dpu_wb_conn_atomic_check(struct drm_connector *connector, if (!conn_state || !conn_state->connector) { DPU_ERROR("invalid connector state\n"); return -EINVAL;
} else if (conn_state->connector->status != connector_status_connected) {
DPU_ERROR("connector not connected %d\n", conn_state->connector->status);
return -EINVAL; }
Can you please explain me about why the status was not already connected?
In all the places I checked (unless I missed something), if the detect() callback is not registered, the connector is assumed connected like below:
404 if (funcs->detect_ctx) 405 ret = funcs->detect_ctx(connector, ctx, force); 406 else if (connector->funcs->detect) 407 ret = connector->funcs->detect(connector, force); 408 else 409 ret = connector_status_connected;
We do not register .detect for WB as WB connector should be always connected.
What scenario or use-case is making the connector status to something other than connected?
The places which mark the connector as unknown seem to be covered by warnings in drm_probe_helper.c but the bug report doesnt seem to be hitting those.
Because nobody asks for the getconnector on that connector. For example,drm_client_for_each_connector_iter() explicitly skips WRITEBACK connectors. So, drm_client_modeset_probe() also doesn't request ->fill_modes() on that connector.
I'm almost sure that if somebody runs a `modetest -ac` on the unpatched kernel after boot, there will be no suspend-related issues. In fact, I've just checked on RB5. /sys/class/drm/card0-Writeback-1/status reports 'unknown' before and 'connected' afterwards. You can easily replicate that on your hardware.
Yes this is correct, I just checked on sc7180.
It stays at unknown till we run IGT. This matches your explanation perfectly.
I am wondering if there is some case in fwk resetting this to unknown silently (which is incorrect) and perhaps other drivers dont hit this as there is a .detect registered which always returns connected and MSM should follow that.
111 static enum drm_connector_status 112 komeda_wb_connector_detect(struct drm_connector *connector, bool force) 113 { 114 return connector_status_connected; 115 } 116
No, that won't help. You can add a detect() callback and verify that simply isn't getting called. It's not resetting the connector->status, it's nobody setting it for the first time.
What we found is that drm_atomic_helper_suspend() which calls drm_atomic_helper_duplicate_state(), uses drm_for_each_connector_iter() which does not rely on the last atomic state but actually uses the config->connector_list which in-turn disables all connectors including WB.
Is this expected that even though WB was not really there in the last atomic_state before the suspend, still ended up suspending / resuming and thus exposing this bug?
I am now more convinced of this change as I understand the flow better. But wanted to highlight above observation.
crtc = conn_state->crtc;
base-commit: 86313a9cd152330c634b25d826a281c6a002eb77 change-id: 20240709-dpu-fix-wb-6cd57e3eb182
Best regards,