From: Ville Syrjälä ville.syrjala@linux.intel.com
Currently we leave the cache_level of the initial fb obj set to NONE. This means on eLLC machines the first pin_to_display() will try to switch it to WT which requires a vma unbind+bind. If that happens during the fbdev initialization rcu does not seem operational which causes the unbind to get stuck. To most appearances this looks like a dead machine on boot.
Avoid the unbind by already marking the object cache_level as WT when creating it. We still do an excplicit ggtt pin which will rewrite the PTEs anyway, so they will match whatever cache level we set.
Cc: stable@vger.kernel.org # v5.7+ Suggested-by: Chris Wilson chris@chris-wilson.co.uk Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381 Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com --- drivers/gpu/drm/i915/display/intel_display.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 907e1d155443..00c08600c60a 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -3445,6 +3445,14 @@ initial_plane_vma(struct drm_i915_private *i915, if (IS_ERR(obj)) return NULL;
+ /* + * Mark it WT ahead of time to avoid changing the + * cache_level during fbdev initialization. The + * unbind there would get stuck waiting for rcu. + */ + i915_gem_object_set_cache_coherency(obj, HAS_WT(i915) ? + I915_CACHE_WT : I915_CACHE_NONE); + switch (plane_config->tiling) { case I915_TILING_NONE: break;
See subject, s/ininitial/iniital/
Quoting Ville Syrjala (2020-10-07 13:03:27)
From: Ville Syrjälä ville.syrjala@linux.intel.com
Currently we leave the cache_level of the initial fb obj set to NONE. This means on eLLC machines the first pin_to_display() will try to switch it to WT which requires a vma unbind+bind. If that happens during the fbdev initialization rcu does not seem operational which causes the unbind to get stuck. To most appearances this looks like a dead machine on boot.
Avoid the unbind by already marking the object cache_level as WT when creating it. We still do an excplicit ggtt pin which will rewrite the PTEs anyway, so they will match whatever cache level we set.
Cc: stable@vger.kernel.org # v5.7+ Suggested-by: Chris Wilson chris@chris-wilson.co.uk Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381 Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com
drivers/gpu/drm/i915/display/intel_display.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 907e1d155443..00c08600c60a 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -3445,6 +3445,14 @@ initial_plane_vma(struct drm_i915_private *i915, if (IS_ERR(obj)) return NULL;
/*
* Mark it WT ahead of time to avoid changing the
* cache_level during fbdev initialization. The
* unbind there would get stuck waiting for rcu.
*/
i915_gem_object_set_cache_coherency(obj, HAS_WT(i915) ?
I915_CACHE_WT : I915_CACHE_NONE);
Ok, I've been worrying about whether there were any more side-effects, but I think it all comes out in the wash. The proof is definitely in the eating, and we will know soon enough if we break someone's virtual terminal.
Reviewed-by: Chris Wilson chris@chris-wilson.co.uk -Chris
On Tue, Oct 13, 2020 at 04:47:49PM +0100, Chris Wilson wrote:
See subject, s/ininitial/iniital/
Quoting Ville Syrjala (2020-10-07 13:03:27)
From: Ville Syrjälä ville.syrjala@linux.intel.com
Currently we leave the cache_level of the initial fb obj set to NONE. This means on eLLC machines the first pin_to_display() will try to switch it to WT which requires a vma unbind+bind. If that happens during the fbdev initialization rcu does not seem operational which causes the unbind to get stuck. To most appearances this looks like a dead machine on boot.
Avoid the unbind by already marking the object cache_level as WT when creating it. We still do an excplicit ggtt pin which will rewrite the PTEs anyway, so they will match whatever cache level we set.
Cc: stable@vger.kernel.org # v5.7+ Suggested-by: Chris Wilson chris@chris-wilson.co.uk Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2381 Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com
drivers/gpu/drm/i915/display/intel_display.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c index 907e1d155443..00c08600c60a 100644 --- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -3445,6 +3445,14 @@ initial_plane_vma(struct drm_i915_private *i915, if (IS_ERR(obj)) return NULL;
/*
* Mark it WT ahead of time to avoid changing the
* cache_level during fbdev initialization. The
* unbind there would get stuck waiting for rcu.
*/
i915_gem_object_set_cache_coherency(obj, HAS_WT(i915) ?
I915_CACHE_WT : I915_CACHE_NONE);
Ok, I've been worrying about whether there were any more side-effects, but I think it all comes out in the wash. The proof is definitely in the eating, and we will know soon enough if we break someone's virtual terminal.
At least it seems to work on my CFL with eLLC caching enabled.
Reviewed-by: Chris Wilson chris@chris-wilson.co.uk
Ta.
linux-stable-mirror@lists.linaro.org