This is the new dma_fence_array based container for shared fences in the dma_resv object.
The advantage of this approach is that you can grab a reference to the current set of shared fences at any time, which allows us to drop the sequence number increment and makes the whole RCU handling much easier.
The disadvantage is that RCU users now have to grab a reference instead of using the sequence counter. As far as I can see i915 was actually the only driver doing this.
So we optimize for adding more fences instead of reading them now.
Another behavior change worth noting is that the shared fences are now only visible after unlocking the dma_resv object or calling dma_resv_fences_commit() manually.
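To illustrate the visibility change described above, here is a minimal userspace sketch of "stage, then commit". It is only a model under stated assumptions: struct fence, struct fence_set and the fence_set_*() helpers are hypothetical names invented for this example, not the dma_resv API, and locking and reference counting are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FENCES 8

/* Hypothetical stand-in for a fence; only an id is needed here. */
struct fence {
	int id;
};

/*
 * Hypothetical model of the container: readers only ever look at the
 * first `visible` entries, writers append behind them.
 */
struct fence_set {
	struct fence *slots[MAX_FENCES];
	size_t visible;	/* entries published to readers */
	size_t staged;	/* entries added so far, published or not */
};

/* Stage a fence; it is not visible to readers yet. */
static int fence_set_add(struct fence_set *set, struct fence *f)
{
	assert(f != NULL);
	if (set->staged == MAX_FENCES)
		return -1;
	set->slots[set->staged++] = f;
	return 0;
}

/* Publish everything staged so far (models the commit on unlock). */
static void fence_set_commit(struct fence_set *set)
{
	set->visible = set->staged;
}

/* Number of fences a reader would currently observe. */
static size_t fence_set_count(const struct fence_set *set)
{
	return set->visible;
}
```

In this model a fence passed to fence_set_add() does not show up in fence_set_count() until fence_set_commit() has run, which mirrors the behavior change described above.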
Please review and/or comment,
Christian.
Hi everyone,
In a previous discussion it surfaced that different drivers use the shared and explicit fences in the dma_resv object with different meanings.
This is problematic when we share buffers between those drivers, and the requirements for implicit and explicit synchronization have led to quite a number of workarounds.
So I started an effort to get all drivers back to a common understanding of what the fences in the dma_resv object mean, and to be able to use the object for different kinds of workloads independent of the classic DRM command submission interface.
The result is this patch set, which modifies the dma_resv API to get away from a single explicit fence and multiple shared fences, towards a notion where we have explicit categories for writers, readers and others.
To do this I came up with a new container called dma_resv_fences which can store either a single fence or multiple fences in a dma_fence_array.
This actually turned out to be quite a bit simpler, since we don't need any complicated dance between RCU and sequence count protected updates any more.
Instead we can just grab a reference to the dma_fence_array under RCU and so keep the current state of synchronization alive until we are done with it.
This results in both a small performance improvement since we don't need so many barriers any more, as well as fewer lines of code in the actual implementation.
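As a rough illustration of "grab a reference under RCU", the following userspace sketch models the pattern with C11 atomics. It is not the code from this series: struct fence_array, fence_array_get_unless_zero(), fence_array_put() and snapshot_fences() are hypothetical names, and the model assumes the memory stays valid while a reader looks at it (in the kernel the RCU read-side critical section provides that guarantee).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference counted fence array. */
struct fence_array {
	atomic_int refcount;
	size_t num_fences;
};

/* Take a reference only if the object is still alive (refcount != 0). */
static bool fence_array_get_unless_zero(struct fence_array *arr)
{
	int old = atomic_load(&arr->refcount);

	while (old != 0) {
		/* On failure `old` is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&arr->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; freeing on the final put is omitted in this model. */
static void fence_array_put(struct fence_array *arr)
{
	atomic_fetch_sub(&arr->refcount, 1);
}

/*
 * Reader side: load the published pointer and try to pin it. If the
 * array is already on its way out, reload the pointer and retry. This
 * assumes the writer publishes a replacement (or NULL) before it drops
 * the last reference to the old array.
 */
static struct fence_array *snapshot_fences(_Atomic(struct fence_array *) *slot)
{
	for (;;) {
		struct fence_array *arr = atomic_load(slot);

		if (!arr)
			return NULL;
		if (fence_array_get_unless_zero(arr))
			return arr;
	}
}
```

A reader that gets a non-NULL pointer back from snapshot_fences() can keep using that snapshot for as long as it likes and calls fence_array_put() when done; no sequence counter is needed to detect concurrent updates.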
Please review and/or comment,
Christian.
We clear the callback list on kref_put so that by the time we
release the fence it is unused. No one should be adding to the cb_list
that they don't themselves hold a reference for.
This small change actually makes the structure 16% smaller.
v2: add the comment to the code as well.
Signed-off-by: Christian König <christian.koenig(a)amd.com>
Reviewed-by: Chris Wilson <chris(a)chris-wilson.co.uk>
---
include/linux/dma-fence.h | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 05d29dbc7e62..bea1d05cf51e 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -65,8 +65,14 @@ struct dma_fence_cb;
 struct dma_fence {
 	struct kref refcount;
 	const struct dma_fence_ops *ops;
-	struct rcu_head rcu;
-	struct list_head cb_list;
+	/* We clear the callback list on kref_put so that by the time we
+	 * release the fence it is unused. No one should be adding to the cb_list
+	 * that they don't themselves hold a reference for.
+	 */
+	union {
+		struct rcu_head rcu;
+		struct list_head cb_list;
+	};
 	spinlock_t *lock;
 	u64 context;
 	u64 seqno;
--
2.17.1
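
The space saving from overlaying rcu and cb_list can be checked in userspace with a small sketch. The mock_* types below are assumptions that only mimic the two-pointer layout of struct rcu_head and struct list_head, and the structs contain only the members visible in the hunk above, so the absolute sizes (and therefore the percentage) differ from the real struct dma_fence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock layouts (assumption): two pointers each, like the kernel types. */
struct mock_list_head {
	struct mock_list_head *next, *prev;
};

struct mock_rcu_head {
	struct mock_rcu_head *next;
	void (*func)(struct mock_rcu_head *head);
};

/* Layout before the patch: rcu and cb_list are separate members. */
struct fence_before {
	int refcount;
	const void *ops;
	struct mock_rcu_head rcu;
	struct mock_list_head cb_list;
	void *lock;
	uint64_t context;
	uint64_t seqno;
};

/* Layout after the patch: rcu and cb_list share storage. */
struct fence_after {
	int refcount;
	const void *ops;
	union {
		struct mock_rcu_head rcu;
		struct mock_list_head cb_list;
	};
	void *lock;
	uint64_t context;
	uint64_t seqno;
};

/* Both union members have the same size, so the union wastes nothing. */
_Static_assert(sizeof(struct mock_rcu_head) == sizeof(struct mock_list_head),
	       "rcu_head and list_head mocks should be the same size");

/* Bytes saved per fence by sharing the storage. */
static size_t bytes_saved(void)
{
	return sizeof(struct fence_before) - sizeof(struct fence_after);
}
```

In this model bytes_saved() comes out as the size of one list head, i.e. two pointers per fence.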