On 05.09.22 at 18:39, Tvrtko Ursulin wrote:
On 05/09/2022 12:21, Christian König wrote:
On 05.09.22 at 12:56, Arvind Yadav wrote:
The core DMA-buf framework needs to enable signaling before the fence is signaled, but it can forget to do so. To catch this scenario on debug kernels, check the DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT status bit before checking the signaled bit, to confirm that signaling has been enabled.
You might want to put this patch at the end of the series to avoid breaking the kernel in between.
Signed-off-by: Arvind Yadav <Arvind.Yadav@amd.com>
 include/linux/dma-fence.h | 5 +++++
 1 file changed, 5 insertions(+)
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 775cdc0b4f24..60c0e935c0b5 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -428,6 +428,11 @@ dma_fence_is_signaled_locked(struct dma_fence *fence)
 static inline bool
 dma_fence_is_signaled(struct dma_fence *fence)
 {
+#ifdef CONFIG_DEBUG_FS
CONFIG_DEBUG_FS is certainly wrong; it is probably better to check for CONFIG_DEBUG_WW_MUTEX_SLOWPATH here.
Apart from that looks good to me,
What's the full story in this series - I'm afraid the cover letter does not make it clear to a casual reader like myself? Where does the difference between debug and non debug kernel come from?
We have a bug where the drm_sync file doesn't properly enable signaling, leading to an igt test failure.
And how do the proposed changes relate to the following kerneldoc excerpt:
 * Since many implementations can call dma_fence_signal() even when before
 * @enable_signaling has been called there's a race window, where the
 * dma_fence_signal() might result in the final fence reference being
 * released and its memory freed. To avoid this, implementations of this
 * callback should grab their own reference using dma_fence_get(), to be
 * released when the fence is signalled (through e.g. the interrupt
 * handler).
 *
 * This callback is optional. If this callback is not present, then the
 * driver must always have signaling enabled.
Is it now an error, or an impossible condition, for "is signaled" to return true _unless_ signaling has been enabled?
That's neither an error nor impossible. For debugging we just never return signaled from the dma_fence_is_signaled() function when signaling was not enabled before.
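As a rough illustration of that debug behaviour, here is a minimal userspace sketch in C. It is not the kernel code: struct model_fence, the MODEL_* flag names, and the MODEL_DEBUG switch are hypothetical stand-ins for struct dma_fence, the DMA_FENCE_FLAG_* bits, and whichever Kconfig symbol the patch ends up checking.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the DMA_FENCE_FLAG_* bits. */
enum {
	MODEL_SIGNALED_BIT      = 1u << 0,
	MODEL_ENABLE_SIGNAL_BIT = 1u << 1,
};

/* Hypothetical stand-in for the debug Kconfig symbol. */
#ifndef MODEL_DEBUG
#define MODEL_DEBUG 1
#endif

struct model_fence {
	unsigned int flags;
};

/* Models an explicit request to enable signaling on the fence. */
static void model_enable_sw_signaling(struct model_fence *f)
{
	f->flags |= MODEL_ENABLE_SIGNAL_BIT;
}

/* Models the fence being signaled, regardless of the enable bit. */
static void model_signal(struct model_fence *f)
{
	f->flags |= MODEL_SIGNALED_BIT;
}

static bool model_is_signaled(const struct model_fence *f)
{
#if MODEL_DEBUG
	/* Debug only: hide the signaled state until signaling was enabled. */
	if (!(f->flags & MODEL_ENABLE_SIGNAL_BIT))
		return false;
#endif
	return (f->flags & MODEL_SIGNALED_BIT) != 0;
}
```

With MODEL_DEBUG set, a fence that was signaled but never had signaling enabled still reads as unsignaled, which is how a caller that forgot the enable step would show up in testing. With MODEL_DEBUG cleared, the extra check compiles away and only the signaled bit is consulted.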
I also plan to remove the return value from the enable_signaling callback. That was just not very well designed.
If the statement (in a later patch) is that signalling should always be explicitly enabled by the callers of dma_fence_add_callback, then what about the existing call to __dma_fence_enable_signaling from dma_fence_add_callback?
Oh, good point. That sounds like we have some bug in the core dma_fence code as well.
Calls to dma_fence_add_callback() and dma_fence_wait() should enable signaling implicitly and don't need an extra call for that.
Only dma_fence_is_signaled() needs this explicit enabling of signaling through dma_fence_enable_sw_signaling().
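The difference between the implicit and explicit paths can be sketched in userspace C as follows. This is a simplified model under stated assumptions, not the dma_fence API: the sketch_* names and struct sketch_fence are hypothetical, callbacks are reduced to counters, and details such as the error returned when adding a callback to an already signaled fence are left out.

```c
#include <stdbool.h>

/* Hypothetical model of a fence; callbacks are reduced to counters. */
struct sketch_fence {
	bool signaling_enabled;
	bool signaled;
	int pending_callbacks;
	int callbacks_run;
};

/* Explicit enable, modelled on dma_fence_enable_sw_signaling(). */
static void sketch_enable_sw_signaling(struct sketch_fence *f)
{
	f->signaling_enabled = true;
}

/* Callback path: signaling is enabled implicitly during registration. */
static void sketch_add_callback(struct sketch_fence *f)
{
	sketch_enable_sw_signaling(f);
	f->pending_callbacks++;
}

/* Signal the fence and run (here: count) every pending callback. */
static void sketch_signal(struct sketch_fence *f)
{
	f->signaled = true;
	f->callbacks_run += f->pending_callbacks;
	f->pending_callbacks = 0;
}

/* Polling path with the debug check: reports nothing until enabled. */
static bool sketch_is_signaled(const struct sketch_fence *f)
{
	return f->signaling_enabled && f->signaled;
}
```

In this model a caller that only polls sketch_is_signaled() has to call sketch_enable_sw_signaling() first, while a caller that registers a callback gets the enable step for free.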
Or, if the rules are changing, shouldn't the kerneldoc be updated as part of the series?
I think the kerneldoc is just a bit misleading. The point is that when you need to call dma_fence_enable_sw_signaling() you must hold a reference to the fence object.
But that's true for all the dma_fence_* functions. The race described in the comment is just nonsense because you need to hold that reference anyway.
Regards, Christian.
Regards,
Tvrtko
Christian.
+	if (!test_bit(DMA_FENCE_FLAG_ENABLE_SIGNAL_BIT, &fence->flags))
+		return false;
+#endif
 	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags))
 		return true;