On Fri, 2024-06-28 at 20:06 +0200, Daniel Vetter wrote:
On Thu, Jun 27, 2024 at 02:18:44PM +0200, Thomas Hellström wrote:
On Thu, 2024-06-27 at 10:04 +0200, Daniel Vetter wrote:
On Wed, Jun 26, 2024 at 05:58:02PM +0200, Thomas Hellström wrote:
Hi!
I'm seeing the below lockdep splat 1) with the xe driver in an imported dma-buf object destruction path.
It's not because we hold that object's dma_resv lock at that point, but because we hold *another* dma_resv lock when the final put() happens. The dma_resv detach runs once the object is idle; in this case the object was already idle at the final put(), so dma_buf_detach() gets called directly in the putting process.
Holding another dma-buf object's lock at put time can happen as part of drm_exec_unlock_all, or simply because the wider vm's dma_resv was held when the object was put, so it's not an uncommon pattern, even if the drm_exec instance can be fixed by putting all bos only after unlocking them all.
Two solutions coming to mind here:
- Provide a dma_buf_detach_locked()
This smells way too much like the endless headaches we had with drm_gem_object_put_locked and friends against drm_device.struct_mutex. Or I'm not understanding what you're doing, because I'm pretty sure you have to take the dma_resv lock on final put() of imported objects. Because that final put() is of the import wrapper, the exporter (and other importers) can still get at that object and so dma_resv_lock is very much needed.
Yeah, the TTM final put looks roughly like

    if (!dma_resv_trylock() || !idle) {
            queue_work(final_destruction);
            return;
    }
    dma_resv_unlock();
    dma_buf_detach(); <--- lockdep splat
Here's where a dma_buf_detach_locked() would've made sense before the dma_resv_unlock().
But if you think this will cause grief, I'm completely fine with fixing this in TTM by always taking the deferring path.
Oh, I misunderstood what you meant; I thought you wanted to do a huge exercise in passing a "do we know we're locked" flag all the way through entire call chains to exporters.
If it's just so that the fastpath of bypassing the worker can function for imported buffers, then I think that's fine. As long as we just punt to the worker if we can't get the lock.
OK, TBH, the driver would need a drm_prime_gem_destroy_locked() as well, since that's the function that calls dma_buf_detach(). But I think it's worth it anyway: if we just modify TTM to always take the delayed destruction path, I figure much code will come to depend on that behavior, and it will be invasive to update later.
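A hedged sketch only of what such a _locked variant might look like, mirroring the existing drm_prime_gem_destroy() but with the caller holding the dma-buf's reservation; both drm_prime_gem_destroy_locked() and dma_buf_detach_locked() are hypothetical here, neither is in the tree today:

```c
/* Sketch, not a real API: assumes dma_buf_detach_locked() exists. */
void drm_prime_gem_destroy_locked(struct drm_gem_object *obj,
				  struct sg_table *sg)
{
	struct dma_buf_attachment *attach = obj->import_attach;
	struct dma_buf *dma_buf = attach->dmabuf;

	dma_resv_assert_held(dma_buf->resv);

	if (sg)
		dma_buf_unmap_attachment(attach, sg, DMA_BIDIRECTIONAL);

	/* attach is freed by the detach, so dma_buf was saved above */
	dma_buf_detach_locked(dma_buf, attach);  /* hypothetical */
	dma_buf_put(dma_buf);                    /* drop the reference */
}
```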
I'll take a quick stab at that to see how ugly it becomes.
/Thomas
-Sima