Am 30.11.21 um 20:27 schrieb Thomas Hellström:
>
> On 11/30/21 19:12, Thomas Hellström wrote:
>> On Tue, 2021-11-30 at 16:02 +0100, Christian König wrote:
>>> Am 30.11.21 um 15:35 schrieb Thomas Hellström:
>>>> On Tue, 2021-11-30 at 14:26 +0100, Christian König wrote:
>>>>> Am 30.11.21 um 13:56 schrieb Thomas Hellström:
>>>>>> On 11/30/21 13:42, Christian König wrote:
>>>>>>> Am 30.11.21 um 13:31 schrieb …
[View More]Thomas Hellström:
>>>>>>>> [SNIP]
>>>>>>>>> Other than that, I didn't investigate the nesting fails
>>>>>>>>> enough to
>>>>>>>>> say I can accurately review this. :)
>>>>>>>> Basically the problem is that within enable_signaling()
>>>>>>>> which
>>>>>>>> is
>>>>>>>> called with the dma_fence lock held, we take the dma_fence
>>>>>>>> lock
>>>>>>>> of
>>>>>>>> another fence. If that other fence is a dma_fence_array, or
>>>>>>>> a
>>>>>>>> dma_fence_chain which in turn tries to lock a
>>>>>>>> dma_fence_array
>>>>>>>> we hit
>>>>>>>> a splat.
>>>>>>> Yeah, I already thought that you constructed something like
>>>>>>> that.
>>>>>>>
>>>>>>> You get the splat because what you do here is illegal, you
>>>>>>> can't
>>>>>>> mix
>>>>>>> dma_fence_array and dma_fence_chain like this or you can end
>>>>>>> up
>>>>>>> in a
>>>>>>> stack corruption.
>>>>>> Hmm. Ok, so what is the stack corruption, is it that the
>>>>>> enable_signaling() will end up with endless recursion? If so,
>>>>>> wouldn't
>>>>>> it be more usable we break that recursion chain and allow a
>>>>>> more
>>>>>> general use?
>>>>> The problem is that this is not easily possible for
>>>>> dma_fence_array
>>>>> containers. Just imagine that you drop the last reference to the
>>>>> containing fences during dma_fence_array destruction if any of
>>>>> the
>>>>> contained fences is another container you can easily run into
>>>>> recursion
>>>>> and with that stack corruption.
>>>> Indeed, that would require some deeper surgery.
>>>>
>>>>> That's one of the major reasons I came up with the
>>>>> dma_fence_chain
>>>>> container. This one you can chain any number of elements together
>>>>> without running into any recursion.
>>>>>
>>>>>> Also what are the mixing rules between these? Never use a
>>>>>> dma-fence-chain as one of the array fences and never use a
>>>>>> dma-fence-array as a dma-fence-chain fence?
>>>>> You can't add any other container to a dma_fence_array, neither
>>>>> other
>>>>> dma_fence_array instances nor dma_fence_chain instances.
>>>>>
>>>>> IIRC at least technically a dma_fence_chain can contain a
>>>>> dma_fence_array if you absolutely need that, but Daniel, Jason
>>>>> and I
>>>>> already had the same discussion a while back and came to the
>>>>> conclusion
>>>>> to avoid that as well if possible.
>>>> Yes, this is actually the use-case. But what I can't easily
>>>> guarantee
>>>> is that that dma_fence_chain isn't fed into a dma_fence_array
>>>> somewhere
>>>> else. How do you typically avoid that?
>>>>
>>>> Meanwhile I guess I need to take a different approach in the driver
>>>> to
>>>> avoid this altogether.
>>> Jason and I came up with a deep dive iterator for his use case, but I
>>> think we don't want to use that any more after my dma_resv rework.
>>>
>>> In other words when you need to create a new dma_fence_array you
>>> flatten
>>> out the existing construct which is at worst case
>>> dma_fence_chain->dma_fence_array->dma_fence.
>> Ok, Are there any cross-driver contract here, Like every driver using a
>> dma_fence_array need to check for dma_fence_chain and flatten like
>> above?
So far we only discussed that on the mailing list but haven't made any
documentation for that.
>>
>> /Thomas
>
> Oh, and a follow up question:
>
> If there was a way to break the recursion on final put() (using the
> same basic approach as patch 2 in this series uses to break recursion
> in enable_signaling()), so that none of these containers did require
> any special treatment, would it be worth pursuing? I guess it might be
> possible by having the callbacks drop the references rather than the
> loop in the final put. + a couple of changes in code iterating over
> the fence pointers.
That won't really help, you just move the recursion from the final put
into the callback.
What could be possible is to use an work item for any possible
operation, e.g. enabling, signaling and destruction. But in the last
discussion everybody agreed that it is better to just flatten out the array.
Christian.
>
>
> /Thomas
>
>>
>>> Regards,
>>> Christian.
>>>
>>>> /Thomas
>>>>
>>>>
>>>>> Regards,
>>>>> Christian.
>>>>>
>>>>>> /Thomas
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Regards,
>>>>>>> Christian.
>>>>>>>
>>>>>>>> But I'll update the commit message with a typical splat.
>>>>>>>>
>>>>>>>> /Thomas
[View Less]
Am 30.11.21 um 15:35 schrieb Thomas Hellström:
> On Tue, 2021-11-30 at 14:26 +0100, Christian König wrote:
>> Am 30.11.21 um 13:56 schrieb Thomas Hellström:
>>> On 11/30/21 13:42, Christian König wrote:
>>>> Am 30.11.21 um 13:31 schrieb Thomas Hellström:
>>>>> [SNIP]
>>>>>> Other than that, I didn't investigate the nesting fails
>>>>>> enough to
>>>>>> say I can accurately review this. :)
>>…
[View More]>>> Basically the problem is that within enable_signaling() which
>>>>> is
>>>>> called with the dma_fence lock held, we take the dma_fence lock
>>>>> of
>>>>> another fence. If that other fence is a dma_fence_array, or a
>>>>> dma_fence_chain which in turn tries to lock a dma_fence_array
>>>>> we hit
>>>>> a splat.
>>>> Yeah, I already thought that you constructed something like that.
>>>>
>>>> You get the splat because what you do here is illegal, you can't
>>>> mix
>>>> dma_fence_array and dma_fence_chain like this or you can end up
>>>> in a
>>>> stack corruption.
>>> Hmm. Ok, so what is the stack corruption, is it that the
>>> enable_signaling() will end up with endless recursion? If so,
>>> wouldn't
>>> it be more usable we break that recursion chain and allow a more
>>> general use?
>> The problem is that this is not easily possible for dma_fence_array
>> containers. Just imagine that you drop the last reference to the
>> containing fences during dma_fence_array destruction if any of the
>> contained fences is another container you can easily run into
>> recursion
>> and with that stack corruption.
> Indeed, that would require some deeper surgery.
>
>> That's one of the major reasons I came up with the dma_fence_chain
>> container. This one you can chain any number of elements together
>> without running into any recursion.
>>
>>> Also what are the mixing rules between these? Never use a
>>> dma-fence-chain as one of the array fences and never use a
>>> dma-fence-array as a dma-fence-chain fence?
>> You can't add any other container to a dma_fence_array, neither other
>> dma_fence_array instances nor dma_fence_chain instances.
>>
>> IIRC at least technically a dma_fence_chain can contain a
>> dma_fence_array if you absolutely need that, but Daniel, Jason and I
>> already had the same discussion a while back and came to the
>> conclusion
>> to avoid that as well if possible.
> Yes, this is actually the use-case. But what I can't easily guarantee
> is that that dma_fence_chain isn't fed into a dma_fence_array somewhere
> else. How do you typically avoid that?
>
> Meanwhile I guess I need to take a different approach in the driver to
> avoid this altogether.
Jason and I came up with a deep dive iterator for his use case, but I
think we don't want to use that any more after my dma_resv rework.
In other words when you need to create a new dma_fence_array you flatten
out the existing construct which is at worst case
dma_fence_chain->dma_fence_array->dma_fence.
Regards,
Christian.
>
> /Thomas
>
>
>> Regards,
>> Christian.
>>
>>> /Thomas
>>>
>>>
>>>
>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>> But I'll update the commit message with a typical splat.
>>>>>
>>>>> /Thomas
>
[View Less]
Am 30.11.21 um 13:56 schrieb Thomas Hellström:
>
> On 11/30/21 13:42, Christian König wrote:
>> Am 30.11.21 um 13:31 schrieb Thomas Hellström:
>>> [SNIP]
>>>> Other than that, I didn't investigate the nesting fails enough to
>>>> say I can accurately review this. :)
>>>
>>> Basically the problem is that within enable_signaling() which is
>>> called with the dma_fence lock held, we take the dma_fence lock of
>>> …
[View More]another fence. If that other fence is a dma_fence_array, or a
>>> dma_fence_chain which in turn tries to lock a dma_fence_array we hit
>>> a splat.
>>
>> Yeah, I already thought that you constructed something like that.
>>
>> You get the splat because what you do here is illegal, you can't mix
>> dma_fence_array and dma_fence_chain like this or you can end up in a
>> stack corruption.
>
> Hmm. Ok, so what is the stack corruption, is it that the
> enable_signaling() will end up with endless recursion? If so, wouldn't
> it be more usable we break that recursion chain and allow a more
> general use?
The problem is that this is not easily possible for dma_fence_array
containers. Just imagine that you drop the last reference to the
containing fences during dma_fence_array destruction if any of the
contained fences is another container you can easily run into recursion
and with that stack corruption.
That's one of the major reasons I came up with the dma_fence_chain
container. This one you can chain any number of elements together
without running into any recursion.
> Also what are the mixing rules between these? Never use a
> dma-fence-chain as one of the array fences and never use a
> dma-fence-array as a dma-fence-chain fence?
You can't add any other container to a dma_fence_array, neither other
dma_fence_array instances nor dma_fence_chain instances.
IIRC at least technically a dma_fence_chain can contain a
dma_fence_array if you absolutely need that, but Daniel, Jason and I
already had the same discussion a while back and came to the conclusion
to avoid that as well if possible.
Regards,
Christian.
>
> /Thomas
>
>
>
>
>>
>> Regards,
>> Christian.
>>
>>>
>>> But I'll update the commit message with a typical splat.
>>>
>>> /Thomas
>>
[View Less]
Am 30.11.21 um 13:31 schrieb Thomas Hellström:
> [SNIP]
>> Other than that, I didn't investigate the nesting fails enough to say
>> I can accurately review this. :)
>
> Basically the problem is that within enable_signaling() which is
> called with the dma_fence lock held, we take the dma_fence lock of
> another fence. If that other fence is a dma_fence_array, or a
> dma_fence_chain which in turn tries to lock a dma_fence_array we hit a
> splat.
Yeah, I …
[View More]already thought that you constructed something like that.
You get the splat because what you do here is illegal, you can't mix
dma_fence_array and dma_fence_chain like this or you can end up in a
stack corruption.
Regards,
Christian.
>
> But I'll update the commit message with a typical splat.
>
> /Thomas
[View Less]
Am 30.11.21 um 13:19 schrieb Thomas Hellström:
> Introducing more usage of dma-fence-chain and dma-fence-array in the
> i915 driver we start to hit lockdep splats due to the recursive fence
> locking in the dma-fence-chain and dma-fence-array containers.
> This is a humble suggestion to try to establish a dma-fence locking order
> (patch 1) and to avoid excessive recursive locking in these containers
> (patch 2)
Well completely NAK to this.
This splats are intentional notes …
[View More]that something in the driver code is
wrong (or we messed up the chain and array containers somehow).
Those two containers are intentionally crafted in a way which allows to
avoid any dependency between their spinlocks. So as long as you
correctly use them you should never see a splat.
Please provide the lockdep splat so that we can analyze the problem.
Thanks,
Christian.
>
> Thomas Hellström (2):
> dma-fence: Avoid establishing a locking order between fence classes
> dma-fence: Avoid excessive recursive fence locking from
> enable_signaling() callbacks
>
> drivers/dma-buf/dma-fence-array.c | 23 +++++++--
> drivers/dma-buf/dma-fence-chain.c | 12 ++++-
> drivers/dma-buf/dma-fence.c | 79 +++++++++++++++++++++----------
> include/linux/dma-fence.h | 4 ++
> 4 files changed, 89 insertions(+), 29 deletions(-)
>
> Cc: linaro-mm-sig(a)lists.linaro.org
> Cc: dri-devel(a)lists.freedesktop.org
> Cc: Christian König <christian.koenig(a)amd.com>
>
[View Less]
Hi everyone,
compared to the last version I've dropped the pruning as suggested by
Maarten, split the new DMA_RESV_USAGE_* patches from the general
introduction as suggeted by Daniel and renamed OTEHRS to BOOKKEEP as
suggested by Pekka.
Please take a look and review,
Christian.
Am 29.11.21 um 13:46 schrieb Thomas Hellström:
> On Mon, 2021-11-29 at 13:33 +0100, Christian König wrote:
>> Am 29.11.21 um 13:23 schrieb Thomas Hellström:
>>> Hi, Christian,
>>>
>>> On Mon, 2021-11-29 at 09:21 +0100, Christian König wrote:
>>>> Am 29.11.21 um 08:35 schrieb Thomas Hellström:
>>>>> If a dma_fence_array is reported signaled by a call to
>>>>> dma_fence_is_signaled(), it may leak the PENDING_ERROR …
[View More]status.
>>>>>
>>>>> Fix this by clearing the PENDING_ERROR status if we return true
>>>>> in
>>>>> dma_fence_array_signaled().
>>>>>
>>>>> Fixes: 1f70b8b812f3 ("dma-fence: Propagate errors to dma-fence-
>>>>> array container")
>>>>> Cc: linaro-mm-sig(a)lists.linaro.org
>>>>> Cc: Christian König <ckoenig.leichtzumerken(a)gmail.com>
>>>>> Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
>>>>> Signed-off-by: Thomas Hellström
>>>>> <thomas.hellstrom(a)linux.intel.com>
>>>> Reviewed-by: Christian König <christian.koenig(a)amd.com>
>>> How are the dma-buf / dma-fence patches typically merged? If i915
>>> is
>>> the only fence->error user, could we take this through drm-intel to
>>> avoid a backmerge for upcoming i915 work?
>> Well that one here looks like a bugfix to me, so either through
>> drm-misc-fixes ore some i915 -fixes branch sounds fine to me.
>>
>> If you have any new development based on that a backmerge of the -
>> fixes
>> into your -next branch is unavoidable anyway.
> Ok, I'll check with Joonas if I can take it through
> drm-intel-gt-next, since fixes are cherry-picked from that one. Patch
> will then appear in both the -fixes and the -next branch.
Well exactly that's the stuff Daniel told me to avoid :)
But maybe your i915 workflow is somehow better handling that than the
AMD workflow.
Christian.
>
> Thanks,
> /Thomas
>
>
>> Regards,
>> Christian.
>>
>>> /Thomas
>>>
>>>
>>>>> ---
>>>>> drivers/dma-buf/dma-fence-array.c | 6 +++++-
>>>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-
>>>>> buf/dma-fence-array.c
>>>>> index d3fbd950be94..3e07f961e2f3 100644
>>>>> --- a/drivers/dma-buf/dma-fence-array.c
>>>>> +++ b/drivers/dma-buf/dma-fence-array.c
>>>>> @@ -104,7 +104,11 @@ static bool
>>>>> dma_fence_array_signaled(struct
>>>>> dma_fence *fence)
>>>>> {
>>>>> struct dma_fence_array *array =
>>>>> to_dma_fence_array(fence);
>>>>>
>>>>> - return atomic_read(&array->num_pending) <= 0;
>>>>> + if (atomic_read(&array->num_pending) > 0)
>>>>> + return false;
>>>>> +
>>>>> + dma_fence_array_clear_pending_error(array);
>>>>> + return true;
>>>>> }
>>>>>
>>>>> static void dma_fence_array_release(struct dma_fence *fence)
>
[View Less]
On Tue, Nov 23, 2021 at 03:21:07PM +0100, Christian König wrote:
> This change adds the dma_resv_usage enum and allows us to specify why a
> dma_resv object is queried for its containing fences.
>
> Additional to that a dma_resv_usage_rw() helper function is added to aid
> retrieving the fences for a read or write userspace submission.
>
> This is then deployed to the different query functions of the dma_resv
> object and all of their users.
>
> Signed-off-by: …
[View More]Christian König <christian.koenig(a)amd.com>
Just a few comments on the kenreldoc while I scroll through.
> EXPORT_SYMBOL(ib_umem_dmabuf_map_pages);
> diff --git a/include/linux/dma-resv.h b/include/linux/dma-resv.h
> index 062571c04bca..37552935bca6 100644
> --- a/include/linux/dma-resv.h
> +++ b/include/linux/dma-resv.h
> @@ -49,6 +49,86 @@ extern struct ww_class reservation_ww_class;
>
> struct dma_resv_list;
>
> +/**
> + * enum dma_resv_usage - how the fences from a dma_resv obj are used
> + *
> + * This enum describes the different use cases for a dma_resv object and
> + * controls which fences are returned when queried.
> + *
> + * An important fact is that there is the order KERNEL<WRITE<READ<OTHER and
> + * when the dma_resv object is asked for fences for one use case the fences
> + * for the lower use case are returned as well.
Might be good to replicate this to all functions that take a
dma_resv_usage flag, and then also add a "See enum dma_resv_usage for more
information." so we get a clickable hyperlink too.
> + *
> + * For example when asking for WRITE fences then the KERNEL fences are returned
> + * as well. Similar when asked for READ fences then both WRITE and KERNEL
> + * fences are returned as well.
> + */
> +enum dma_resv_usage {
> + /**
> + * @DMA_RESV_USAGE_KERNEL: For in kernel memory management only.
> + *
> + * This should only be used for things like copying or clearing memory
> + * with a DMA hardware engine for the purpose of kernel memory
> + * management.
> + *
> + * Drivers *always* need to wait for those fences before accessing the
> + * resource protected by the dma_resv object. The only exception for
> + * that is when the resource is known to be locked down in place by
> + * pinning it previously.
Should dma_buf_pin also do the wait for kernel fences? I think that would
also ba e bit clearer semantics in the dma-buf patch which does these
waits for us.
Or should dma_buf_pin be pipelined and it's up to callers to wait? For kms
that's definitely the semantics we want, but it's a bit playing with fire
situation, so maybe dma-buf should get the more idiot proof semantics?
> + */
> + DMA_RESV_USAGE_KERNEL,
> +
> + /**
> + * @DMA_RESV_USAGE_WRITE: Implicit write synchronization.
> + *
> + * This should only be used for userspace command submissions which add
> + * an implicit write dependency.
> + */
> + DMA_RESV_USAGE_WRITE,
> +
> + /**
> + * @DMA_RESV_USAGE_READ: Implicit read synchronization.
> + *
> + * This should only be used for userspace command submissions which add
> + * an implicit read dependency.
> + */
> + DMA_RESV_USAGE_READ,
> +
> + /**
> + * @DMA_RESV_USAGE_OTHER: No implicit sync.
> + *
> + * This should be used for operations which don't want to add an
> + * implicit dependency at all, but still have a dependency on memory
> + * management.
> + *
> + * This might include things like preemption fences as well as device
> + * page table updates or even userspace command submissions.
I think we should highlight a bit more that for explicitly synchronized
userspace like vk OTHER is the normal case. So really not an exception.
Ofc aside from amdkgf there's currently no driver doing this, but really
we should have lots of them ...
> + *
> + * The kernel memory management *always* need to wait for those fences
> + * before moving or freeing the resource protected by the dma_resv
> + * object.
> + */
> + DMA_RESV_USAGE_OTHER
> +};
> +
> +/**
> + * dma_resv_usage_rw - helper for implicit sync
> + * @write: true if we create a new implicit sync write
> + *
> + * This returns the implicit synchronization usage for write or read accesses.
Pls add "See enum dma_resv_usage for more details." or so. Never hurts to
be plentiful with links :-)
> + */
> +static inline enum dma_resv_usage dma_resv_usage_rw(bool write)
> +{
> + /* This looks confusing at first sight, but is indeed correct.
> + *
> + * The rational is that new write operations needs to wait for the
> + * existing read and write operations to finish.
> + * But a new read operation only needs to wait for the existing write
> + * operations to finish.
> + */
> + return write ? DMA_RESV_USAGE_READ : DMA_RESV_USAGE_WRITE;
> +}
> +
> /**
> * struct dma_resv - a reservation object manages fences for a buffer
> *
> @@ -147,8 +227,8 @@ struct dma_resv_iter {
> /** @obj: The dma_resv object we iterate over */
> struct dma_resv *obj;
>
> - /** @all_fences: If all fences should be returned */
> - bool all_fences;
> + /** @usage: Controls which fences are returned */
> + enum dma_resv_usage usage;
>
> /** @fence: the currently handled fence */
> struct dma_fence *fence;
> @@ -178,14 +258,14 @@ struct dma_fence *dma_resv_iter_next(struct dma_resv_iter *cursor);
> * dma_resv_iter_begin - initialize a dma_resv_iter object
> * @cursor: The dma_resv_iter object to initialize
> * @obj: The dma_resv object which we want to iterate over
> - * @all_fences: If all fences should be returned or just the exclusive one
> + * @usage: controls which fences to return
Please add the blurb here I mentioned above. Maybe adjust the text to use
the neatly highlighted @usage.
> */
> static inline void dma_resv_iter_begin(struct dma_resv_iter *cursor,
> struct dma_resv *obj,
> - bool all_fences)
> + enum dma_resv_usage usage)
> {
> cursor->obj = obj;
> - cursor->all_fences = all_fences;
> + cursor->usage = usage;
> cursor->fence = NULL;
> }
>
> @@ -242,7 +322,7 @@ static inline bool dma_resv_iter_is_restarted(struct dma_resv_iter *cursor)
> * dma_resv_for_each_fence - fence iterator
> * @cursor: a struct dma_resv_iter pointer
> * @obj: a dma_resv object pointer
> - * @all_fences: true if all fences should be returned
> + * @usage: controls which fences to return
> * @fence: the current fence
> *
Same, another place that needs the @usage clarification.
> * Iterate over the fences in a struct dma_resv object while holding the
> @@ -251,8 +331,8 @@ static inline bool dma_resv_iter_is_restarted(struct dma_resv_iter *cursor)
> * valid as long as the lock is held and so no extra reference to the fence is
> * taken.
> */
> -#define dma_resv_for_each_fence(cursor, obj, all_fences, fence) \
> - for (dma_resv_iter_begin(cursor, obj, all_fences), \
> +#define dma_resv_for_each_fence(cursor, obj, usage, fence) \
> + for (dma_resv_iter_begin(cursor, obj, usage), \
> fence = dma_resv_iter_first(cursor); fence; \
> fence = dma_resv_iter_next(cursor))
>
> @@ -421,14 +501,14 @@ void dma_resv_replace_fences(struct dma_resv *obj, uint64_t context,
> void dma_resv_add_excl_fence(struct dma_resv *obj, struct dma_fence *fence);
> void dma_resv_prune(struct dma_resv *obj);
> void dma_resv_prune_unlocked(struct dma_resv *obj);
> -int dma_resv_get_fences(struct dma_resv *obj, bool write,
> +int dma_resv_get_fences(struct dma_resv *obj, enum dma_resv_usage usage,
> unsigned int *num_fences, struct dma_fence ***fences);
> -int dma_resv_get_singleton(struct dma_resv *obj, bool write,
> +int dma_resv_get_singleton(struct dma_resv *obj, enum dma_resv_usage usage,
> struct dma_fence **fence);
> int dma_resv_copy_fences(struct dma_resv *dst, struct dma_resv *src);
> -long dma_resv_wait_timeout(struct dma_resv *obj, bool wait_all, bool intr,
> - unsigned long timeout);
> -bool dma_resv_test_signaled(struct dma_resv *obj, bool test_all);
> +long dma_resv_wait_timeout(struct dma_resv *obj, enum dma_resv_usage usage,
> + bool intr, unsigned long timeout);
> +bool dma_resv_test_signaled(struct dma_resv *obj, enum dma_resv_usage usage);
> void dma_resv_describe(struct dma_resv *obj, struct seq_file *seq);
I took endless amounts of discussions, but I think we're arriving at
something really neat and tiny here now finally. Both semantics, and how
drivers use them.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
[View Less]
Am 29.11.21 um 13:23 schrieb Thomas Hellström:
> Hi, Christian,
>
> On Mon, 2021-11-29 at 09:21 +0100, Christian König wrote:
>> Am 29.11.21 um 08:35 schrieb Thomas Hellström:
>>> If a dma_fence_array is reported signaled by a call to
>>> dma_fence_is_signaled(), it may leak the PENDING_ERROR status.
>>>
>>> Fix this by clearing the PENDING_ERROR status if we return true in
>>> dma_fence_array_signaled().
>>>
>>> …
[View More]Fixes: 1f70b8b812f3 ("dma-fence: Propagate errors to dma-fence-
>>> array container")
>>> Cc: linaro-mm-sig(a)lists.linaro.org
>>> Cc: Christian König <ckoenig.leichtzumerken(a)gmail.com>
>>> Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
>>> Signed-off-by: Thomas Hellström <thomas.hellstrom(a)linux.intel.com>
>> Reviewed-by: Christian König <christian.koenig(a)amd.com>
> How are the dma-buf / dma-fence patches typically merged? If i915 is
> the only fence->error user, could we take this through drm-intel to
> avoid a backmerge for upcoming i915 work?
Well that one here looks like a bugfix to me, so either through
drm-misc-fixes ore some i915 -fixes branch sounds fine to me.
If you have any new development based on that a backmerge of the -fixes
into your -next branch is unavoidable anyway.
Regards,
Christian.
>
> /Thomas
>
>
>>> ---
>>> drivers/dma-buf/dma-fence-array.c | 6 +++++-
>>> 1 file changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/dma-buf/dma-fence-array.c b/drivers/dma-
>>> buf/dma-fence-array.c
>>> index d3fbd950be94..3e07f961e2f3 100644
>>> --- a/drivers/dma-buf/dma-fence-array.c
>>> +++ b/drivers/dma-buf/dma-fence-array.c
>>> @@ -104,7 +104,11 @@ static bool dma_fence_array_signaled(struct
>>> dma_fence *fence)
>>> {
>>> struct dma_fence_array *array = to_dma_fence_array(fence);
>>>
>>> - return atomic_read(&array->num_pending) <= 0;
>>> + if (atomic_read(&array->num_pending) > 0)
>>> + return false;
>>> +
>>> + dma_fence_array_clear_pending_error(array);
>>> + return true;
>>> }
>>>
>>> static void dma_fence_array_release(struct dma_fence *fence)
>
[View Less]