Hi everyone and especially Daniel,
this is the revised patch set to fix and rework dma_buf_poll(). The new code should avoid the problems with RCU and now also correctly waits for all fences in the resv object.
The rest of the series is then the well-known change to dma_resv_test_signaled(), nouveau and now also msm.
Last are two patches which drop the workarounds from amdgpu, but those can wait until the next cycle.
I think it would be rather good to have at least the change to dma_buf_poll() pushed in this merge window, and maybe even CC stable, since the current code looks really broken to me.
Please review, test and/or comment.
Thanks, Christian.
Explicitly document that code can't assume that shared fences signal after the exclusive fence.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-resv.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index f26c71747d43..4ab02b6c387a 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -235,7 +235,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
  * @fence: the shared fence to add
  *
  * Add a fence to a shared slot, obj->lock must be held, and
- * dma_resv_reserve_shared() has been called.
+ * dma_resv_reserve_shared() has been called. The shared fences can signal in
+ * any order and there is especially no guarantee that shared fences signal
+ * after the exclusive one. Code relying on any signaling order is broken and
+ * needs to be fixed.
  */
 void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
 {
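To spell out what this rules out, here is a minimal sketch of the kind of logic the new wording declares broken (shared_fences_signaled() is a hypothetical helper, locking omitted):

	static bool all_fences_done(struct dma_resv *resv)
	{
		struct dma_resv_list *fobj = dma_resv_shared_list(resv);
		struct dma_fence *excl = dma_resv_excl_fence(resv);

		/*
		 * BROKEN: assumes the exclusive fence always signals before
		 * the shared ones and can be skipped when shared fences exist.
		 */
		if (fobj && fobj->shared_count)
			return shared_fences_signaled(fobj);

		return !excl || dma_fence_is_signaled(excl);
	}

The nouveau and msm patches later in the series fix exactly this pattern.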
On Wed, Jun 16, 2021 at 10:26:49AM +0200, Christian König wrote:
> Explicitly document that code can't assume that shared fences signal
> after the exclusive fence.
>
> Signed-off-by: Christian König <christian.koenig@amd.com>
> ---
>  drivers/dma-buf/dma-resv.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
> index f26c71747d43..4ab02b6c387a 100644
> --- a/drivers/dma-buf/dma-resv.c
> +++ b/drivers/dma-buf/dma-resv.c
> @@ -235,7 +235,10 @@ EXPORT_SYMBOL(dma_resv_reset_shared_max);
>   * @fence: the shared fence to add
>   *
>   * Add a fence to a shared slot, obj->lock must be held, and
> - * dma_resv_reserve_shared() has been called.
> + * dma_resv_reserve_shared() has been called. The shared fences can signal in
> + * any order and there is especially no guarantee that shared fences signal
> + * after the exclusive one. Code relying on any signaling order is broken and
> + * needs to be fixed.
So I agree these are reasonable semantics, but you need to audit drivers first, because currently that's not how at least a bunch of them work. There are way more drivers than the handful you've looked at.
Imo the gold standard here is what I've tried doing for the "how do we set fences" side, which is going through all of them. The trouble is that this is a bit nastier, because a) drivers play many more tricks here and b) understanding each driver's scheduling logic is more work than understanding how they set fences for a request/cs.
Unfortunately I haven't gotten around to doing that yet, because it means a few days of uninterrupted time crawling through way too much code. I haven't even found time to respin my old series to make the fence setting more consistent (since I found a few more issues there than just the amdgpu one that sparked it all).
-Daniel
>   */
>  void dma_resv_add_shared_fence(struct dma_resv *obj, struct dma_fence *fence)
>  {
> --
> 2.25.1
Daniel pointed me towards this function and there are multiple obvious problems in the implementation.
First of all, the retry loop is not working as intended. In general, the retry only makes sense if you grab the reference first and then check the sequence values.
It's also good practice to keep the reference around when installing callbacks to fences you don't own.
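Something like this minimal sketch is the pattern I mean (assuming the usual dma_resv seqcount rules; error handling trimmed):

	rcu_read_lock();
retry:
	seq = read_seqcount_begin(&resv->seq);
	fence = rcu_dereference(resv->fence_excl);
	if (fence && !dma_fence_get_rcu(fence))
		goto retry;			/* grab the reference first ... */
	if (read_seqcount_retry(&resv->seq, seq)) {
		dma_fence_put(fence);		/* ... then check the sequence */
		goto retry;
	}
	rcu_read_unlock();
	/* the reference is kept while the callback is installed */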
Then we skipped checking the exclusive fence when shared fences were present.
And last, the whole implementation was unnecessarily complex and rather hard to understand, which could easily lead to unexpected behavior of the IOCTL.
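For context, this is how the poll support is supposed to look from userspace after the rework (a hedged sketch; dmabuf_fd stands in for the fd of an exported buffer):

	#include <poll.h>

	struct pollfd pfd = {
		.fd = dmabuf_fd,
		/* POLLIN waits for the exclusive fence only,
		 * POLLOUT waits for all shared fences plus the exclusive one */
		.events = POLLIN | POLLOUT,
	};
	poll(&pfd, 1, -1);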
Fix all this by reworking the implementation from scratch.
Only mildly tested and needs a thoughtful review of the code.
v2: fix the reference counting as well
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-buf.c | 133 ++++++++++++++++----------------
 include/linux/dma-buf.h  |   2 +-
 2 files changed, 55 insertions(+), 80 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 511fe0d217a0..b67fbf4e3705 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -72,7 +72,7 @@ static void dma_buf_release(struct dentry *dentry)
 	 * If you hit this BUG() it means someone dropped their ref to the
 	 * dma-buf while still having pending operation to the buffer.
	 */
-	BUG_ON(dmabuf->cb_shared.active || dmabuf->cb_excl.active);
+	BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active);
 
 	dmabuf->ops->release(dmabuf);
 
@@ -202,16 +202,20 @@ static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb)
 	wake_up_locked_poll(dcb->poll, dcb->active);
 	dcb->active = 0;
 	spin_unlock_irqrestore(&dcb->poll->lock, flags);
+	dma_fence_put(fence);
 }
 
 static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 {
+	struct dma_buf_poll_cb_t *dcb;
 	struct dma_buf *dmabuf;
 	struct dma_resv *resv;
 	struct dma_resv_list *fobj;
 	struct dma_fence *fence_excl;
-	__poll_t events;
 	unsigned shared_count, seq;
+	struct dma_fence *fence;
+	__poll_t events;
+	int r, i;
 
 	dmabuf = file->private_data;
 	if (!dmabuf || !dmabuf->resv)
@@ -225,99 +229,70 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 	if (!events)
 		return 0;
 
+	dcb = events & EPOLLOUT ? &dmabuf->cb_out : &dmabuf->cb_in;
+
+	/* Only queue a new one if we are not still waiting for the old one */
+	spin_lock_irq(&dmabuf->poll.lock);
+	if (dcb->active)
+		events = 0;
+	else
+		dcb->active = events;
+	spin_unlock_irq(&dmabuf->poll.lock);
+	if (!events)
+		return 0;
+
 retry:
 	seq = read_seqcount_begin(&resv->seq);
 	rcu_read_lock();
 
 	fobj = rcu_dereference(resv->fence);
-	if (fobj)
+	if (fobj && events & EPOLLOUT)
 		shared_count = fobj->shared_count;
 	else
 		shared_count = 0;
-	fence_excl = dma_resv_excl_fence(resv);
-	if (read_seqcount_retry(&resv->seq, seq)) {
-		rcu_read_unlock();
-		goto retry;
-	}
 
-	if (fence_excl && (!(events & EPOLLOUT) || shared_count == 0)) {
-		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_excl;
-		__poll_t pevents = EPOLLIN;
-
-		if (shared_count == 0)
-			pevents |= EPOLLOUT;
-
-		spin_lock_irq(&dmabuf->poll.lock);
-		if (dcb->active) {
-			dcb->active |= pevents;
-			events &= ~pevents;
-		} else
-			dcb->active = pevents;
-		spin_unlock_irq(&dmabuf->poll.lock);
-
-		if (events & pevents) {
-			if (!dma_fence_get_rcu(fence_excl)) {
-				/* force a recheck */
-				events &= ~pevents;
-				dma_buf_poll_cb(NULL, &dcb->cb);
-			} else if (!dma_fence_add_callback(fence_excl, &dcb->cb,
-							   dma_buf_poll_cb)) {
-				events &= ~pevents;
-				dma_fence_put(fence_excl);
-			} else {
-				/*
-				 * No callback queued, wake up any additional
-				 * waiters.
-				 */
-				dma_fence_put(fence_excl);
-				dma_buf_poll_cb(NULL, &dcb->cb);
-			}
+	for (i = 0; i < shared_count; ++i) {
+		fence = rcu_dereference(fobj->shared[i]);
+		fence = dma_fence_get_rcu(fence);
+		if (!fence || read_seqcount_retry(&resv->seq, seq)) {
+			/* Concurrent modify detected, force re-check */
+			dma_fence_put(fence);
+			rcu_read_unlock();
+			goto retry;
 		}
-	}
 
-	if ((events & EPOLLOUT) && shared_count > 0) {
-		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_shared;
-		int i;
-
-		/* Only queue a new callback if no event has fired yet */
-		spin_lock_irq(&dmabuf->poll.lock);
-		if (dcb->active)
-			events &= ~EPOLLOUT;
-		else
-			dcb->active = EPOLLOUT;
-		spin_unlock_irq(&dmabuf->poll.lock);
-
-		if (!(events & EPOLLOUT))
+		r = dma_fence_add_callback(fence, &dcb->cb, dma_buf_poll_cb);
+		if (!r) {
+			/* Callback queued */
+			events = 0;
 			goto out;
+		}
+		dma_fence_put(fence);
+	}
 
-		for (i = 0; i < shared_count; ++i) {
-			struct dma_fence *fence = rcu_dereference(fobj->shared[i]);
-
-			if (!dma_fence_get_rcu(fence)) {
-				/*
-				 * fence refcount dropped to zero, this means
-				 * that fobj has been freed
-				 *
-				 * call dma_buf_poll_cb and force a recheck!
-				 */
-				events &= ~EPOLLOUT;
-				dma_buf_poll_cb(NULL, &dcb->cb);
-				break;
-			}
-			if (!dma_fence_add_callback(fence, &dcb->cb,
-						    dma_buf_poll_cb)) {
-				dma_fence_put(fence);
-				events &= ~EPOLLOUT;
-				break;
-			}
+	fence = dma_resv_excl_fence(resv);
+	if (fence) {
+		fence = dma_fence_get_rcu(fence);
+		if (!fence || read_seqcount_retry(&resv->seq, seq)) {
+			/* Concurrent modify detected, force re-check */
 			dma_fence_put(fence);
+			rcu_read_unlock();
+			goto retry;
+		}
-		/* No callback queued, wake up any additional waiters. */
-		if (i == shared_count)
-			dma_buf_poll_cb(NULL, &dcb->cb);
+		r = dma_fence_add_callback(fence, &dcb->cb, dma_buf_poll_cb);
+		if (!r) {
+			/* Callback queued */
+			events = 0;
+			goto out;
+		}
+		dma_fence_put(fence);
 	}
 
+	/* No callback queued, wake up any additional waiters. */
+	dma_buf_poll_cb(NULL, &dcb->cb);
+
 out:
 	rcu_read_unlock();
 	return events;
@@ -562,8 +537,8 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
 	dmabuf->owner = exp_info->owner;
 	spin_lock_init(&dmabuf->name_lock);
 	init_waitqueue_head(&dmabuf->poll);
-	dmabuf->cb_excl.poll = dmabuf->cb_shared.poll = &dmabuf->poll;
-	dmabuf->cb_excl.active = dmabuf->cb_shared.active = 0;
+	dmabuf->cb_in.poll = dmabuf->cb_out.poll = &dmabuf->poll;
+	dmabuf->cb_in.active = dmabuf->cb_out.active = 0;
 
 	if (!resv) {
 		resv = (struct dma_resv *)&dmabuf[1];
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index efdc56b9d95f..7e747ad54c81 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -329,7 +329,7 @@ struct dma_buf {
 		wait_queue_head_t *poll;
 
 		__poll_t active;
-	} cb_excl, cb_shared;
+	} cb_in, cb_out;
 };
 
 /**
As the name implies, if testing all fences is requested we should indeed test all fences and not skip the exclusive one just because we see shared ones.
v2: fix logic once more
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/dma-buf/dma-resv.c | 33 ++++++++++++---------------------
 1 file changed, 12 insertions(+), 21 deletions(-)
diff --git a/drivers/dma-buf/dma-resv.c b/drivers/dma-buf/dma-resv.c
index 4ab02b6c387a..18dd5a6ca06c 100644
--- a/drivers/dma-buf/dma-resv.c
+++ b/drivers/dma-buf/dma-resv.c
@@ -618,25 +618,21 @@ static inline int dma_resv_test_signaled_single(struct dma_fence *passed_fence)
  */
 bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
 {
-	unsigned int seq, shared_count;
+	struct dma_fence *fence;
+	unsigned int seq;
 	int ret;
 
 	rcu_read_lock();
 retry:
 	ret = true;
-	shared_count = 0;
 	seq = read_seqcount_begin(&obj->seq);
 
 	if (test_all) {
 		struct dma_resv_list *fobj = dma_resv_shared_list(obj);
-		unsigned int i;
-
-		if (fobj)
-			shared_count = fobj->shared_count;
+		unsigned int i, shared_count;
 
+		shared_count = fobj ? fobj->shared_count : 0;
 		for (i = 0; i < shared_count; ++i) {
-			struct dma_fence *fence;
-
 			fence = rcu_dereference(fobj->shared[i]);
 			ret = dma_resv_test_signaled_single(fence);
 			if (ret < 0)
@@ -644,24 +640,19 @@ bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all)
 				goto retry;
 			else if (!ret)
 				break;
 		}
-
-		if (read_seqcount_retry(&obj->seq, seq))
-			goto retry;
 	}
 
-	if (!shared_count) {
-		struct dma_fence *fence_excl = dma_resv_excl_fence(obj);
-
-		if (fence_excl) {
-			ret = dma_resv_test_signaled_single(fence_excl);
-			if (ret < 0)
-				goto retry;
+	fence = dma_resv_excl_fence(obj);
+	if (ret && fence) {
+		ret = dma_resv_test_signaled_single(fence);
+		if (ret < 0)
+			goto retry;
-
-			if (read_seqcount_retry(&obj->seq, seq))
-				goto retry;
-		}
 	}
 
+	if (read_seqcount_retry(&obj->seq, seq))
+		goto retry;
+
 	rcu_read_unlock();
 	return ret;
 }
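To illustrate the fixed semantics (hedged sketch):

	/* test_all == true now really covers every fence */
	if (dma_resv_test_signaled(obj, true)) {
		/* all shared fences and the exclusive fence have signaled */
	}

	/* test_all == false still only checks the exclusive fence */
	if (dma_resv_test_signaled(obj, false)) {
		/* the exclusive fence, if any, has signaled */
	}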
Drivers also need to sync to the exclusive fence when a shared one is present.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index 6b43918035df..05d0b3eb3690 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -358,7 +358,7 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan, bool e
 	fobj = dma_resv_shared_list(resv);
 	fence = dma_resv_excl_fence(resv);
 
-	if (fence && (!exclusive || !fobj || !fobj->shared_count)) {
+	if (fence) {
 		struct nouveau_channel *prev = NULL;
 		bool must_wait = true;
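The rule for drivers boils down to something like this sketch (sync_to_fence() is a stand-in for whatever per-driver primitive applies; locking and RCU details omitted):

	fence = dma_resv_excl_fence(resv);
	if (fence)
		sync_to_fence(fence);	/* always, even when shared fences exist */

	fobj = dma_resv_shared_list(resv);
	for (i = 0; exclusive && fobj && i < fobj->shared_count; ++i)
		sync_to_fence(rcu_dereference(fobj->shared[i]));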
Drivers also need to sync to the exclusive fence when a shared one is present.
Completely untested since the driver won't even compile on !ARM.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/msm/msm_gem.c | 16 +++++++---------
 1 file changed, 7 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index a94a43de95ef..72a07e311de3 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -817,17 +817,15 @@ int msm_gem_sync_object(struct drm_gem_object *obj,
 	struct dma_fence *fence;
 	int i, ret;
 
-	fobj = dma_resv_shared_list(obj->resv);
-	if (!fobj || (fobj->shared_count == 0)) {
-		fence = dma_resv_excl_fence(obj->resv);
-		/* don't need to wait on our own fences, since ring is fifo */
-		if (fence && (fence->context != fctx->context)) {
-			ret = dma_fence_wait(fence, true);
-			if (ret)
-				return ret;
-		}
+	fence = dma_resv_excl_fence(obj->resv);
+	/* don't need to wait on our own fences, since ring is fifo */
+	if (fence && (fence->context != fctx->context)) {
+		ret = dma_fence_wait(fence, true);
+		if (ret)
+			return ret;
 	}
 
+	fobj = dma_resv_shared_list(obj->resv);
 	if (!exclusive || !fobj)
 		return 0;
drm/amdgpu: drop workaround for adding page table clears as shared fence
We no longer need to add the exclusive fence as a shared fence as well.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 8 +-------
 1 file changed, 1 insertion(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 0780e8d18992..d7baa207f391 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -207,7 +207,7 @@ static void amdgpu_gem_object_close(struct drm_gem_object *obj,
 	INIT_LIST_HEAD(&duplicates);
 
 	tv.bo = &bo->tbo;
-	tv.num_shared = 2;
+	tv.num_shared = 1;
 	list_add(&tv.head, &list);
 
 	amdgpu_vm_get_pd_bo(vm, &list, &vm_pd);
@@ -226,12 +226,6 @@ static void amdgpu_gem_object_close(struct drm_gem_object *obj,
 	if (!amdgpu_vm_ready(vm))
 		goto out_unlock;
 
-	fence = dma_resv_excl_fence(bo->tbo.base.resv);
-	if (fence) {
-		amdgpu_bo_fence(bo, fence, true);
-		fence = NULL;
-	}
-
 	r = amdgpu_vm_clear_freed(adev, vm, &fence);
 	if (r || !fence)
 		goto out_unlock;
Drop the workaround of manually adding the shared fence in the CS.
Signed-off-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 25655414e9c0..af8f5ff5f12c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -1273,14 +1273,10 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p,
 		/*
 		 * Work around dma_resv shortcommings by wrapping up the
 		 * submission in a dma_fence_chain and add it as exclusive
-		 * fence, but first add the submission as shared fence to make
-		 * sure that shared fences never signal before the exclusive
-		 * one.
+		 * fence.
 		 */
 		dma_fence_chain_init(chain, dma_resv_excl_fence(resv),
 				     dma_fence_get(p->fence), 1);
-
-		dma_resv_add_shared_fence(resv, p->fence);
 		rcu_assign_pointer(resv->fence_excl, &chain->base);
 		e->chain = NULL;
 	}
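For clarity, what survives is just the chain (a hedged reading of the code above): a dma_fence_chain only signals once both the previous exclusive fence and the new submission have signaled, so the ordering guarantee no longer depends on the extra shared fence:

	dma_fence_chain_init(chain,
			     dma_resv_excl_fence(resv),	/* previous exclusive fence */
			     dma_fence_get(p->fence),	/* this submission */
			     1);
	/* the chain becomes the new exclusive fence */
	rcu_assign_pointer(resv->fence_excl, &chain->base);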