Hi,
This series of two patches fixes the issue introduced in cf586021642d80 ("drm/i915/gt: Pipelined page migration") where, as reported by Matt, in a chain of requests an error is reported only if happens in the last request.
However Chris noticed that without ensuring exclusivity in the locking we might end up in some deadlock. That's why patch 1 throttles for the ringspace in order to make sure that no one is holding it.
Version 1 of this patch has been reviewed by matt and this version is adding Chris exclusive locking.
Thanks Chris for this work.
Andi
Changelog ========= v1 -> v2 - Add patch 1 for ensuring exclusive locking of the timeline - Reword git commit of patch 2.
Andi Shyti (1): drm/i915/gt: Make sure that errors are propagated through request chains
Chris Wilson (1): drm/i915: Throttle for ringspace prior to taking the timeline mutex
drivers/gpu/drm/i915/gt/intel_context.c | 41 +++++++++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_context.h | 2 ++ drivers/gpu/drm/i915/gt/intel_migrate.c | 31 +++++++++++++------ drivers/gpu/drm/i915/i915_request.c | 3 ++ 4 files changed, 67 insertions(+), 10 deletions(-)
From: Chris Wilson chris@chris-wilson.co.uk
Before taking exclusive ownership of the ring for emitting the request, wait for space in the ring to become available. This allows others to take the timeline->mutex to make forward progresses while userspace is blocked.
In particular, this allows regular clients to issue requests on the kernel context, potentially filling the ring, but allow the higher priority heartbeats and pulses to still be submitted without being blocked by the less critical work.
Fixes: cf586021642d80 ("drm/i915/gt: Pipelined page migration") Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Maciej Patelczyk maciej.patelczyk@intel.com Cc: stable@vger.kernel.org Signed-off-by: Andi Shyti andi.shyti@linux.intel.com --- Hi,
I'm not sure I need to add the Fixes tag here as this is more preparatory for the next patch. Together, though, patch 1 and 2 make the fix with proper locking mechanism.
Andi
drivers/gpu/drm/i915/gt/intel_context.c | 41 +++++++++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_context.h | 2 ++ drivers/gpu/drm/i915/i915_request.c | 3 ++ 3 files changed, 46 insertions(+)
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 2aa63ec521b89..59cd612a23561 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -626,6 +626,47 @@ bool intel_context_revoke(struct intel_context *ce) return ret; }
+int intel_context_throttle(const struct intel_context *ce) +{ + const struct intel_ring *ring = ce->ring; + const struct intel_timeline *tl = ce->timeline; + struct i915_request *rq; + int err = 0; + + if (READ_ONCE(ring->space) >= SZ_1K) + return 0; + + rcu_read_lock(); + list_for_each_entry_reverse(rq, &tl->requests, link) { + if (__i915_request_is_complete(rq)) + break; + + if (rq->ring != ring) + continue; + + /* Wait until there will be enough space following that rq */ + if (__intel_ring_space(rq->postfix, + ring->emit, + ring->size) < ring->size / 2) { + if (i915_request_get_rcu(rq)) { + rcu_read_unlock(); + + if (i915_request_wait(rq, + I915_WAIT_INTERRUPTIBLE, + MAX_SCHEDULE_TIMEOUT) < 0) + err = -EINTR; + + rcu_read_lock(); + i915_request_put(rq); + } + break; + } + } + rcu_read_unlock(); + + return err; +} + #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) #include "selftest_context.c" #endif diff --git a/drivers/gpu/drm/i915/gt/intel_context.h b/drivers/gpu/drm/i915/gt/intel_context.h index 0a8d553da3f43..f919a66cebf5b 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.h +++ b/drivers/gpu/drm/i915/gt/intel_context.h @@ -226,6 +226,8 @@ static inline void intel_context_exit(struct intel_context *ce) ce->ops->exit(ce); }
+int intel_context_throttle(const struct intel_context *ce); + static inline struct intel_context *intel_context_get(struct intel_context *ce) { kref_get(&ce->ref); diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index 7503dcb9043bb..a1741c4a8cffd 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -1035,6 +1035,9 @@ i915_request_create(struct intel_context *ce) struct i915_request *rq; struct intel_timeline *tl;
+ if (intel_context_throttle(ce)) + return ERR_PTR(-EINTR); + tl = intel_context_timeline_lock(ce); if (IS_ERR(tl)) return ERR_CAST(tl);
Currently, when we perform operations such as clearing or copying large blocks of memory, we generate multiple requests that are executed in a chain.
However, if one of these requests fails, we may not realize it unless it happens to be the last request in the chain. This is because errors are not properly propagated.
For this we need to keep propagating the chain of fence notification in order to always reach the final fence associated to the final request.
To address this issue, we need to ensure that the chain of fence notifications is always propagated so that we can reach the final fence associated with the last request. By doing so, we will be able to detect any memory operation failures and determine whether the memory is still invalid.
On copy and clear migration signal fences upon completion.
On copy and clear migration, signal fences upon request completion to ensure that we have a reliable perpetuation of the operation outcome.
Fixes: cf586021642d80 ("drm/i915/gt: Pipelined page migration") Reported-by: Matthew Auld matthew.auld@intel.com Suggested-by: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Andi Shyti andi.shyti@linux.intel.com Cc: stable@vger.kernel.org Reviewed-by: Matthew Auld matthew.auld@intel.com --- drivers/gpu/drm/i915/gt/intel_migrate.c | 31 +++++++++++++++++-------- 1 file changed, 21 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_migrate.c b/drivers/gpu/drm/i915/gt/intel_migrate.c index 3f638f1987968..8a293045a7b96 100644 --- a/drivers/gpu/drm/i915/gt/intel_migrate.c +++ b/drivers/gpu/drm/i915/gt/intel_migrate.c @@ -748,7 +748,7 @@ intel_context_migrate_copy(struct intel_context *ce, rq = i915_request_create(ce); if (IS_ERR(rq)) { err = PTR_ERR(rq); - goto out_ce; + break; }
if (deps) { @@ -878,10 +878,14 @@ intel_context_migrate_copy(struct intel_context *ce,
/* Arbitration is re-enabled between requests. */ out_rq: - if (*out) - i915_request_put(*out); - *out = i915_request_get(rq); + i915_sw_fence_await(&rq->submit); + i915_request_get(rq); i915_request_add(rq); + if (*out) { + i915_sw_fence_complete(&(*out)->submit); + i915_request_put(*out); + } + *out = rq;
if (err) break; @@ -905,7 +909,8 @@ intel_context_migrate_copy(struct intel_context *ce, cond_resched(); } while (1);
-out_ce: + if (*out) + i915_sw_fence_complete(&(*out)->submit); return err; }
@@ -1005,7 +1010,7 @@ intel_context_migrate_clear(struct intel_context *ce, rq = i915_request_create(ce); if (IS_ERR(rq)) { err = PTR_ERR(rq); - goto out_ce; + break; }
if (deps) { @@ -1056,17 +1061,23 @@ intel_context_migrate_clear(struct intel_context *ce,
/* Arbitration is re-enabled between requests. */ out_rq: - if (*out) - i915_request_put(*out); - *out = i915_request_get(rq); + i915_sw_fence_await(&rq->submit); + i915_request_get(rq); i915_request_add(rq); + if (*out) { + i915_sw_fence_complete(&(*out)->submit); + i915_request_put(*out); + } + *out = rq; + if (err || !it.sg || !sg_dma_len(it.sg)) break;
cond_resched(); } while (1);
-out_ce: + if (*out) + i915_sw_fence_complete(&(*out)->submit); return err; }
linux-stable-mirror@lists.linaro.org