Quoting Chris Wilson (2019-08-14 18:38:20)
Quoting Chris Wilson (2019-08-14 18:22:53)
Quoting Chris Wilson (2019-08-14 18:06:18)
Quoting Chris Wilson (2019-08-14 17:42:48)
Quoting Daniel Vetter (2019-08-14 16:39:08)
> > + } while (rcu_access_pointer(obj->fence_excl) != *excl);
What if someone is real fast (like really real fast) and recycles the exclusive fence so you read the same pointer twice, but everything else changed? reused fence pointer is a lot more likely than seqlock wrapping around.
It's an exclusive fence. If it is replaced, it must be later than all the shared fences (and dependent on them directly or indirectly), and so still a consistent snapshot.
An extension of that argument says we don't even need to loop,
*list = rcu_dereference(obj->fence); *shared_count = *list ? (*list)->shared_count : 0; smp_rmb(); *excl = rcu_dereference(obj->fence_excl);
Gives a consistent snapshot. It doesn't matter if the fence_excl is before or after the shared_list -- if it's after, it's a superset of all fences, if it's before, we have a correct list of shared fences the earlier fence_excl.
The problem is that the point of the loop is that we do need a check on the fences after the full memory barrier.
Thinking of the rationale beaten out for dma_fence_get_excl_rcu_safe()
We don't have a full memory barrier here, so this cannot be used safely in light of fence reallocation.
i.e. we need to restore the loops in the callers,
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_busy.c b/drivers/gpu/drm/i915/gem/i915_gem_busy.c index a2aff1d8290e..f019062c8cd7 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_busy.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_busy.c @@ -110,6 +110,7 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data, * to report the overall busyness. This is what the wait-ioctl does. * */ +retry: dma_resv_fences(obj->base.resv, &excl, &list, &shared_count);
/* Translate the exclusive fence to the READ *and* WRITE engine */
@@ -122,6 +123,10 @@ i915_gem_busy_ioctl(struct drm_device *dev, void *data, args->busy |= busy_check_reader(fence); }
smp_rmb();
if (excl != rcu_access_pointer(obj->base.resv->fence_excl))
goto retry;
wrap that up as
static inline bool dma_resv_fences_retry(struct dma_resv *resv, struct dma_fence *excl) { smp_rmb(); return excl != rcu_access_pointer(resv->fence_excl); }
I give up. It's not just the fence_excl that's an issue here.
Any of the shared fences may be replaced after dma_resv_fences() and so the original shared fence pointer may be reassigned (even under RCU). The only defense against that is the seqcount.
I totally screwed that up. -Chris