On Fri, Mar 16, 2018 at 02:20:44PM +0100, Christian König wrote:
Hi everybody,
since I've got positive feedback from Daniel I continued working on this approach.
A few issues are still open:
- Daniel suggested that I make the invalidate_mappings callback a parameter of dma_buf_attach().
This approach unfortunately won't work because when the attachment is created the importer is not necessarily ready to handle invalidation events.
Why do you have this constraint? This sounds a bit like inverted create/teardown sequence troubles, where you make an object "live" before the thing is fully set up.
Can't we fix this by creating the entire ttm scaffolding you'll need for a dma-buf upfront, and only once you have everything we grab the dma_buf attachment? At that point you really should be able to evict buffers again.
Not requiring invalidate_mappings to be set together with the attachment means we can't ever require importers to support it (e.g. to address your concern with the userspace dma-buf userptr magic).
E.g. in the amdgpu example we first need to set up the imported GEM/TTM objects and install them in the attachment.
My solution is to introduce a separate function to grab the locks and set the callback. This function could then also be used to pin the buffer later on if that turns out to be necessary after all.
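The two orderings under discussion can be compared with a small userspace model. This is only a sketch: every name below (`model_attachment`, `model_attach_with_cb()`, `model_set_invalidate()` and so on) is hypothetical and is not the dma-buf API. It assumes that an exporter may move a buffer only when the attachment carries an invalidation callback.

```c
/* Userspace model only; none of these names are real dma-buf symbols. */
#include <assert.h>
#include <stddef.h>

struct model_attachment;
typedef void (*model_invalidate_fn)(struct model_attachment *attach);

/* Stands in for the importer's GEM/TTM object. */
struct model_importer_bo {
    int mappings_valid;
};

struct model_attachment {
    struct model_importer_bo *priv;  /* importer state used by the callback */
    model_invalidate_fn invalidate;  /* NULL: importer cannot handle moves */
};

/* Ordering A: the callback is an attach-time parameter, so the
 * importer-side object has to exist before the attachment is created. */
static void model_attach_with_cb(struct model_attachment *a,
                                 struct model_importer_bo *bo,
                                 model_invalidate_fn cb)
{
    assert(bo != NULL);  /* scaffolding must be built up front */
    a->priv = bo;
    a->invalidate = cb;
}

/* Ordering B: attach first, install the callback in a separate step. */
static void model_attach(struct model_attachment *a)
{
    a->priv = NULL;
    a->invalidate = NULL;
}

static void model_set_invalidate(struct model_attachment *a,
                                 struct model_importer_bo *bo,
                                 model_invalidate_fn cb)
{
    assert(bo != NULL);
    a->priv = bo;
    a->invalidate = cb;
}

/* Exporter side: returns 1 if the buffer was moved, 0 if it had to stay. */
static int model_exporter_try_move(struct model_attachment *a)
{
    if (a->invalidate == NULL)
        return 0;          /* importer not ready: treat the buffer as fixed */
    a->invalidate(a);      /* tell the importer its mappings are stale */
    return 1;
}

/* Importer callback: drop the cached mappings. */
static void model_importer_invalidate(struct model_attachment *a)
{
    a->priv->mappings_valid = 0;
}
```

In this model, ordering B leaves a window in which the attachment exists but the exporter cannot move the buffer. With ordering A the window disappears, at the cost of requiring the importer object to be ready before attaching.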
- With my example setup this currently results in a ping/pong situation
because the exporter prefers a VRAM placement while the importer prefers a GTT placement.
This results in quite a performance drop, but can be fixed by a simple Mesa patch which allows shared BOs to be placed in both VRAM and GTT.
Question is what should we do in the meantime? Accept the performance drop or only allow unpinned sharing with new Mesa?
Maybe the exporter should not try to move stuff back into VRAM as long as there's an active dma-buf? I mean it's really cool that it works, but maybe let's just do this for a tech demo :-)
Of course if it then runs out of TT then it could still try to move it back in. And "let's not move it when it's imported" is probably too stupid too, and will need to be improved again with more heuristics, but would at least get it off the ground.
Long term you might want to move perhaps once per 10 seconds or so, to get idle importers to detach. Adjust 10s to match whatever benchmark/workload you care about.
-Daniel
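The heuristic suggested above (leave an imported buffer where it is, unless TT runs out or a rate limit of roughly 10 seconds has expired) could look something like the following userspace sketch. All names are hypothetical and nothing here is amdgpu or TTM code. The 10 second interval is the placeholder value from the mail and would need tuning per workload.

```c
/* Userspace model of the placement heuristic; all names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* "Once per 10 seconds or so"; adjust to the workload you care about. */
#define MODEL_MOVE_INTERVAL_MS 10000u

struct model_shared_bo {
    unsigned int active_importers;  /* number of live dma-buf attachments */
    uint64_t last_move_ms;          /* timestamp of the last permitted move */
};

/*
 * Decide whether the exporter may move the buffer back into VRAM.
 * Allowed when nobody imports the buffer, when GTT space has run out,
 * or when the rate limit since the last move has expired.
 */
static bool model_may_move_to_vram(struct model_shared_bo *bo,
                                   uint64_t now_ms, bool gtt_full)
{
    assert(now_ms >= bo->last_move_ms);  /* time must not run backwards */

    bool allowed = bo->active_importers == 0 || gtt_full ||
                   now_ms - bo->last_move_ms >= MODEL_MOVE_INTERVAL_MS;

    if (allowed)
        bo->last_move_ms = now_ms;  /* restart the rate-limit window */
    return allowed;
}
```

A caller would consult `model_may_move_to_vram()` before each eviction or validation decision. The occasional forced move is what gives idle importers a chance to detach.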