On Wed, Aug 14, 2019 at 05:42:48PM +0100, Chris Wilson wrote:
Quoting Daniel Vetter (2019-08-14 16:39:08)
Sorry, I buried myself in some other stuff ...
On Sat, Aug 10, 2019 at 12:51:00PM +0200, Christian König wrote:
On 07.08.19 at 16:17, Chris Wilson wrote:
Quoting Christian König (2019-08-07 14:53:12)
The only remaining use for this is to protect against setting a new exclusive fence while we grab both exclusive and shared. That can also be achieved by checking whether the exclusive fence has changed after completing the operation.
v2: switch setting excl fence to rcu_assign_pointer
Signed-off-by: Christian König <christian.koenig@amd.com>
 drivers/dma-buf/reservation.c | 24 +++++-------------------
 include/linux/reservation.h  |  9 ++-------
 2 files changed, 7 insertions(+), 26 deletions(-)
diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
index 90bc6ef03598..f7f4a0858c2a 100644
--- a/drivers/dma-buf/reservation.c
+++ b/drivers/dma-buf/reservation.c
@@ -49,12 +49,6 @@
 DEFINE_WD_CLASS(reservation_ww_class);
 EXPORT_SYMBOL(reservation_ww_class);
 
-struct lock_class_key reservation_seqcount_class;
-EXPORT_SYMBOL(reservation_seqcount_class);
-
-const char reservation_seqcount_string[] = "reservation_seqcount";
-EXPORT_SYMBOL(reservation_seqcount_string);
-
 /**
  * reservation_object_list_alloc - allocate fence list
  * @shared_max: number of fences we need space for
@@ -103,9 +97,6 @@ static void reservation_object_list_free(struct reservation_object_list *list)
 void reservation_object_init(struct reservation_object *obj)
 {
 	ww_mutex_init(&obj->lock, &reservation_ww_class);
-	__seqcount_init(&obj->seq, reservation_seqcount_string,
-			&reservation_seqcount_class);
 	RCU_INIT_POINTER(obj->fence, NULL);
 	RCU_INIT_POINTER(obj->fence_excl, NULL);
 }
@@ -282,12 +273,10 @@ void reservation_object_add_excl_fence(struct reservation_object *obj,
 	dma_fence_get(fence);
 
 	preempt_disable();
-	write_seqcount_begin(&obj->seq);
-	/* write_seqcount_begin provides the necessary memory barrier */
-	RCU_INIT_POINTER(obj->fence_excl, fence);
+	rcu_assign_pointer(obj->fence_excl, fence);
+	/* pointer update must be visible before we modify the shared_count */
Pls add a "see reservation_object_fence()" here or similar.
 	if (old)
-		old->shared_count = 0;
-	write_seqcount_end(&obj->seq);
+		smp_store_mb(old->shared_count, 0);
So your comment and the kerneldoc don't match up. Quoting Documentation/memory-barriers.txt:
This assigns the value to the variable and then inserts a full memory barrier after it. It isn't guaranteed to insert anything more than a compiler barrier in a UP compilation.
So order is 1. store 2. fence, but your comment suggests you want it the other way round.
What's more weird is that it is a fully serialising instruction that is used to fence first as part of the update. If that's the way PeterZ uses it...
I haven't looked at the implementations tbh, just going with the text. Or do you mean in the write_seqlock that we're replacing?
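For reference, the generic fallback in include/asm-generic/barrier.h makes the order Daniel quotes explicit, i.e. store first, full barrier after (on x86 smp_store_mb() is instead implemented with xchg(), which is presumably the serialising instruction Chris mentions). An illustrative expansion of the patch's line, not a suggested change:

	/* smp_store_mb(old->shared_count, 0) expands, generically, to roughly: */
	if (old) {
		WRITE_ONCE(old->shared_count, 0);
		smp_mb();	/* full memory barrier *after* the store */
	}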
 	preempt_enable();
 
 	/* inplace update, no shared fences */
@@ -368,11 +357,8 @@ int reservation_object_copy_fences(struct reservation_object *dst,
 	old = reservation_object_get_excl(dst);
 
 	preempt_disable();
-	write_seqcount_begin(&dst->seq);
-	/* write_seqcount_begin provides the necessary memory barrier */
-	RCU_INIT_POINTER(dst->fence_excl, new);
-	RCU_INIT_POINTER(dst->fence, dst_list);
-	write_seqcount_end(&dst->seq);
+	rcu_assign_pointer(dst->fence_excl, new);
+	rcu_assign_pointer(dst->fence, dst_list);
 	preempt_enable();
 
 	reservation_object_list_free(src_list);
diff --git a/include/linux/reservation.h b/include/linux/reservation.h
index 044a5cd4af50..fd29baad0be3 100644
--- a/include/linux/reservation.h
+++ b/include/linux/reservation.h
@@ -46,8 +46,6 @@
 #include <linux/rcupdate.h>
 
 extern struct ww_class reservation_ww_class;
-extern struct lock_class_key reservation_seqcount_class;
-extern const char reservation_seqcount_string[];
 
 /**
  * struct reservation_object_list - a list of shared fences
@@ -71,7 +69,6 @@ struct reservation_object_list {
  */
 struct reservation_object {
 	struct ww_mutex lock;
-	seqcount_t seq;
 
 	struct dma_fence __rcu *fence_excl;
 	struct reservation_object_list __rcu *fence;
@@ -156,14 +153,12 @@ reservation_object_fences(struct reservation_object *obj,
 			  struct reservation_object_list **list,
 			  u32 *shared_count)
 {
-	unsigned int seq;
-
 	do {
-		seq = read_seqcount_begin(&obj->seq);
 		*excl = rcu_dereference(obj->fence_excl);
I think you need a barrier between this and the read of shared_count below. But rcu_dereference only gives you a dependency barrier, i.e. only stuff that's accessed through this pointer is ordered. Which means the access to ->shared_count, which goes through another pointer, isn't actually ordered.
Well,
	do {
		excl = ...
		smp_rmb();
		(list, count) = ...
		smp_rmb();
	} while (excl != ...)
would be the straightforward conversion of the seqlock.
Yeah I cheated by looking there, and couldn't convince myself that we can't drop the first smp_rmb() ...
I think the implementation is that it is an unconditional compiler barrier (but that might change), but you're definitely missing the CPU barrier, so a CPU might speculate the entire thing out of order.
I think you need another smp_rmb(); here
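Spelled out with the field names from the patch, Chris's straightforward conversion above would look roughly like this; the first smp_rmb() is the one the patch omits and this exchange is about (illustrative sketch, not the proposed code):

	do {
		*excl = rcu_dereference(obj->fence_excl);
		smp_rmb();	/* orders the excl read against the reads below,
				 * which go through a different pointer */
		*list = rcu_dereference(obj->fence);
		*shared_count = *list ? (*list)->shared_count : 0;
		smp_rmb();	/* orders the reads above against the re-check */
	} while (rcu_access_pointer(obj->fence_excl) != *excl);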
 		*list = rcu_dereference(obj->fence);
 		*shared_count = *list ? (*list)->shared_count : 0;
-	} while (read_seqcount_retry(&obj->seq, seq));
+		smp_rmb(); /* See reservation_object_add_excl_fence */
This fence here I think prevents the re-reading of ->fence_excl from getting hoisted above the critical reads. So this is just the open-coded seqlock retry loop.
Without the seq.
The dilemma for dropping the seq would be what if we were to add another state here, such as modified or even invalidate.
+	} while (rcu_access_pointer(obj->fence_excl) != *excl);
What if someone is real fast (like really real fast) and recycles the exclusive fence, so you read the same pointer twice but everything else changed? A reused fence pointer is a lot more likely than the seqlock wrapping around.
It's an exclusive fence. If it is replaced, it must be later than all the shared fences (and dependent on them directly or indirectly), and so still a consistent snapshot.
I'm not worried about the fence, that part is fine. But we're de facto using the fence as a fancy seqlock-of-sorts. And if the fence gets reused and the pointers match, then our seqlock-of-sorts breaks. But I haven't looked around whether there's more in the code that makes this an irrelevant issue. -Daniel
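To make the concern concrete, a purely hypothetical interleaving (whether a fence can actually be recycled this way, given refcounting and RCU grace periods, is exactly the open question):

	/*
	 * reader (retry loop)                 writer(s)
	 *
	 * excl = fence_excl           == A
	 *                                     fence_excl: A -> B, shared list updated
	 *                                     B replaced by a recycled fence with
	 *                                     A's pointer value, shared list updated
	 *                                     again
	 * read list / shared_count            (some mix of the states above)
	 * re-check: fence_excl == A           retry check passes even though the
	 *                                     object changed twice in between
	 */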