On 17/09/2021 13:35, Christian König wrote:
> Simplifying the code a bit.
>
> v2: add missing rcu read unlock.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/gpu/drm/i915/gem/i915_gem_wait.c | 57 ++++++------------------
>  1 file changed, 14 insertions(+), 43 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/gem/i915_gem_wait.c b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
> index f909aaa09d9c..e416cf528635 100644
> --- a/drivers/gpu/drm/i915/gem/i915_gem_wait.c
> +++ b/drivers/gpu/drm/i915/gem/i915_gem_wait.c
> @@ -37,55 +37,26 @@ i915_gem_object_wait_reservation(struct dma_resv *resv,
>  				 unsigned int flags,
>  				 long timeout)
>  {
> -	struct dma_fence *excl;
> -	bool prune_fences = false;
> -
> -	if (flags & I915_WAIT_ALL) {
> -		struct dma_fence **shared;
> -		unsigned int count, i;
> -		int ret;
> -
> -		ret = dma_resv_get_fences(resv, &excl, &count, &shared);
> -		if (ret)
> -			return ret;
> -
> -		for (i = 0; i < count; i++) {
> -			timeout = i915_gem_object_wait_fence(shared[i],
> -							     flags, timeout);
> -			if (timeout < 0)
> -				break;
> -
> -			dma_fence_put(shared[i]);
> -		}
> -
> -		for (; i < count; i++)
> -			dma_fence_put(shared[i]);
> -
> -		kfree(shared);
> -
> -		/*
> -		 * If both shared fences and an exclusive fence exist,
> -		 * then by construction the shared fences must be later
> -		 * than the exclusive fence. If we successfully wait for
> -		 * all the shared fences, we know that the exclusive fence
> -		 * must all be signaled. If all the shared fences are
> -		 * signaled, we can prune the array and recover the
> -		 * floating references on the fences/requests.
> -		 */
> -		prune_fences = count && timeout >= 0;
> -	} else {
> -		excl = dma_resv_get_excl_unlocked(resv);
> -	}
> -
> -	if (excl && timeout >= 0)
> -		timeout = i915_gem_object_wait_fence(excl, flags, timeout);
> -
> -	dma_fence_put(excl);
> +	struct dma_resv_iter cursor;
> +	struct dma_fence *fence;
> +
> +	rcu_read_lock();
> +	dma_resv_iter_begin(&cursor, resv, flags & I915_WAIT_ALL);
> +	dma_resv_for_each_fence_unlocked(&cursor, fence) {
> +		rcu_read_unlock();
> +		timeout = i915_gem_object_wait_fence(fence, flags, timeout);
Converting this one could be problematic. It's the wait ioctl which used to grab an atomic snapshot of the fences and wait for that rendering to complete. With this change I think it has the potential to run forever, since it keeps catching new activity against the same object.

I am not sure whether or not the difference is relevant for how userspace uses it, but I think it needs discussion.
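For reference, the snapshot behaviour is essentially what the removed I915_WAIT_ALL path was doing: grab the fence pointers once via dma_resv_get_fences() and only ever wait on those, so anything queued after the snapshot cannot extend the wait. A rough sketch (hypothetical helper name, untested, error handling trimmed):

static long
wait_reservation_snapshot(struct dma_resv *resv, unsigned int flags,
			  long timeout)
{
	struct dma_fence *excl, **shared;
	unsigned int count, i;
	int ret;

	/* One-shot snapshot of the fences attached right now. */
	ret = dma_resv_get_fences(resv, &excl, &count, &shared);
	if (ret)
		return ret;

	for (i = 0; i < count; i++) {
		/* Once timed out, stop waiting but keep dropping refs. */
		if (timeout >= 0)
			timeout = i915_gem_object_wait_fence(shared[i],
							     flags, timeout);
		dma_fence_put(shared[i]);
	}
	kfree(shared);

	if (excl && timeout >= 0)
		timeout = i915_gem_object_wait_fence(excl, flags, timeout);
	dma_fence_put(excl);

	return timeout;
}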
Hm actually there are internal callers as well, and at least some of those have the object locked. Would a wider refactoring to separate those into buckets (locked vs unlocked) make sense?
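If we did split it, the locked bucket could presumably drop the RCU dance entirely. A rough sketch (hypothetical name, untested, assuming the locked dma_resv_for_each_fence() iterator from earlier in this series):

static long
wait_reservation_locked(struct dma_resv *resv, unsigned int flags,
			long timeout)
{
	struct dma_resv_iter cursor;
	struct dma_fence *fence;

	dma_resv_assert_held(resv);

	/* No RCU and no restarts since the list cannot change under us. */
	dma_resv_for_each_fence(&cursor, resv, flags & I915_WAIT_ALL, fence) {
		timeout = i915_gem_object_wait_fence(fence, flags, timeout);
		if (timeout < 0)
			break;
	}

	return timeout;
}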
Regards,
Tvrtko
> +		rcu_read_lock();
> +		if (timeout < 0)
> +			break;
> +	}
> +	dma_resv_iter_end(&cursor);
> +	rcu_read_unlock();
>
>  	/*
>  	 * Opportunistically prune the fences iff we know they have *all* been
>  	 * signaled.
>  	 */
> -	if (prune_fences)
> +	if (timeout > 0)
>  		dma_resv_prune(resv);
>
>  	return timeout;
>  }