Hi!
I'm just looking over what's needed to implement drm Prime / dma-buf exports + imports in the vmwgfx driver. It seems like most dma-bufs ops are quite straightforward to implement except user-space mmap().
The reason being that vmwgfx dma-bufs will be using completely non-coherent memory, whenever there needs to be CPU accesses.
The accelerated contents resides in an opaque structure on the device into which we can DMA to and from, so for mmap to work we need to zap ptes and DMA to the device when doing something accelerated, and on the first page-fault DMA data back and wait for idle if the device did a write to the dma-buf.
Now this shouldn't really be a problem if dma-bufs were only used for cross-device sharing, but since people apparently want to use dma-buf file handles to share CPU data between processes it really becomes a serious problem.
Needless to say we'd want to limit the size of the DMAs, and have mmap users request regions for read, and mark regions dirty for write, something similar to gallium's texture transfer objects.
Any ideas?
/Thomas