Turns out we are sending a lot more hotplug events then we need, and this is causing some pretty serious issues. Currently, we call intel_dp_mst_resume() in i915_drm_resume() well before we have any sort of hotplugging setup. This is a pretty big problem, because in practice it will generally result in throwing the power domain refcounts out of wack.
For instance: On my T480s, removing a previously connected topology before the system finishes resuming causes drm_kms_helper_hotplug_event() to be called before HPD is setup again, which causes us to do a connector reprobe, which then causes intel_dp_detect() to be called on all DP devices -including- the eDP display. From there, intel_dp_detect() is run on the eDP display which triggers DPCD transactions. Those DPCD transactions then cause us to call edp_panel_vdd_on(), which then causes us to grab an additional wakeref to the relevant power wells (PORT_DDI_A_IO on this machine).
From there, this wakeref is never released which then causes the next
suspend/resume cycle to entirely fail due to the hardware not being powered off correctly.
This sucks really badly, and I don't see any decent way to actually fix this in intel_dp_detect() easily. Additionally, I don't even think it'd be worth the time now since we're not expecting to handle any kind of connector reprobing at the point in which we call intel_dp_mst_resume(), but we also can't move intel_dp_mst_resume() any higher in the resume process since MST topologies need to be resumed before intel_display_resume() is called.
However, there's a light at the end of the tunnel! After reading through a lot of code dozens of times, it occurred to me that we -never- actually need to send hotplug events when calling drm_dp_mst_topology_mgr_set_mst() since we send hotplug events in drm_dp_destroy_connector_work(). Imagine that!
So, since we only seem to call intel_dp_mst_check_status() to disable MST on the encoder in question and then send a hotplug, get rid of this and instead just disable MST mode when a hub fails in intel_dp_mst_resume(). From there, drm_dp_destroy_connector_work() will eventually send the hotplug event.
Signed-off-by: Lyude Paul lyude@redhat.com Fixes: 0e32b39ceed6 ("drm/i915: add DP 1.2 MST support (v0.7)") Cc: Todd Previte tprevite@gmail.com Cc: Dave Airlie airlied@redhat.com Cc: Jani Nikula jani.nikula@linux.intel.com Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Rodrigo Vivi rodrigo.vivi@intel.com Cc: intel-gfx@lists.freedesktop.org Cc: stable@vger.kernel.org # v3.17+ --- drivers/gpu/drm/i915/intel_dp.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index 681e88405ada..c2399acf177b 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -7096,7 +7096,10 @@ void intel_dp_mst_resume(struct drm_i915_private *dev_priv) continue;
ret = drm_dp_mst_topology_mgr_resume(&intel_dp->mst_mgr); - if (ret) - intel_dp_check_mst_status(intel_dp); + if (ret) { + intel_dp->is_mst = false; + drm_dp_mst_topology_mgr_set_mst(&intel_dp->mst_mgr, + false); + } } }
On Fri, Jan 25, 2019 at 08:24:35PM -0500, Lyude Paul wrote:
Turns out we are sending a lot more hotplug events then we need, and this is causing some pretty serious issues. Currently, we call intel_dp_mst_resume() in i915_drm_resume() well before we have any sort of hotplugging setup.
We call hpd_irq_setup() before calling intel_dp_mst_resume(). The only purpose of that part (lifted out from intel_hpd_init()) is to provide the short HPD interrupt functionality MST AUX transfers need.
But you are right in that - as a side-effect - we'll also enable generic hotplug functionality that is independent of the above MST requirement. Doing that kind of generic hotplug processing before intel_display_resume() is probably not a good idea, it can interfere at least with the mode restore in __intel_display_resume().
This is a pretty big problem, because in practice it will generally result in throwing the power domain refcounts out of wack.
For instance: On my T480s, removing a previously connected topology before the system finishes resuming causes drm_kms_helper_hotplug_event() to be called before HPD is setup again, which causes us to do a connector reprobe, which then causes intel_dp_detect() to be called on all DP devices -including- the eDP display. From there, intel_dp_detect() is run on the eDP display which triggers DPCD transactions. Those DPCD transactions then cause us to call edp_panel_vdd_on(), which then causes us to grab an additional wakeref to the relevant power wells (PORT_DDI_A_IO on this machine). From there, this wakeref is never released which then causes the next suspend/resume cycle to entirely fail due to the hardware not being powered off correctly.
This sucks really badly, and I don't see any decent way to actually fix this in intel_dp_detect() easily. Additionally, I don't even think it'd be worth the time now since we're not expecting to handle any kind of connector reprobing at the point in which we call intel_dp_mst_resume(), but we also can't move intel_dp_mst_resume() any higher in the resume process since MST topologies need to be resumed before intel_display_resume() is called.
However, there's a light at the end of the tunnel! After reading through a lot of code dozens of times, it occurred to me that we -never- actually need to send hotplug events when calling drm_dp_mst_topology_mgr_set_mst() since we send hotplug events in drm_dp_destroy_connector_work(). Imagine that!
So, since we only seem to call intel_dp_mst_check_status() to disable MST on the encoder in question and then send a hotplug, get rid of this and instead just disable MST mode when a hub fails in intel_dp_mst_resume(). From there, drm_dp_destroy_connector_work() will eventually send the hotplug event.
Signed-off-by: Lyude Paul lyude@redhat.com Fixes: 0e32b39ceed6 ("drm/i915: add DP 1.2 MST support (v0.7)") Cc: Todd Previte tprevite@gmail.com Cc: Dave Airlie airlied@redhat.com Cc: Jani Nikula jani.nikula@linux.intel.com Cc: Joonas Lahtinen joonas.lahtinen@linux.intel.com Cc: Rodrigo Vivi rodrigo.vivi@intel.com Cc: intel-gfx@lists.freedesktop.org Cc: stable@vger.kernel.org # v3.17+
Not knowing enough about the MST code, but we do need to prevent generic hotplug processing at this point:
Acked-by: Imre Deak imre.deak@intel.com
drivers/gpu/drm/i915/intel_dp.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c index 681e88405ada..c2399acf177b 100644 --- a/drivers/gpu/drm/i915/intel_dp.c +++ b/drivers/gpu/drm/i915/intel_dp.c @@ -7096,7 +7096,10 @@ void intel_dp_mst_resume(struct drm_i915_private *dev_priv) continue; ret = drm_dp_mst_topology_mgr_resume(&intel_dp->mst_mgr);
if (ret)
intel_dp_check_mst_status(intel_dp);
if (ret) {
intel_dp->is_mst = false;
drm_dp_mst_topology_mgr_set_mst(&intel_dp->mst_mgr,
false);
}}
}
2.20.1
Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
linux-stable-mirror@lists.linaro.org