On Fri, Sep 2, 2011 at 10:21 AM, Clark, Rob rob@ti.com wrote:
On Fri, Sep 2, 2011 at 5:24 AM, Daniel Vetter daniel@ffwll.ch wrote:
On Thu, Sep 01, 2011 at 01:11:51PM -0500, Clark, Rob wrote:
You've probably discussed this all at the recent summit I've totally missed, but why not add a struct device * pointer argument to get_scatterlist? This way we could shovel all the platform specific complexity into platform specific code.
Ie. add in struct device_dma_parameters?
Well, I guess we still need a way to describe "weird" requirements.. but I guess having the right place to describe them is a good start.
I think that struct device is the right place because the iommu code already needs it there, so why not reuse it for the shared buffer layout arbitrage? At least that was my thinking.
yup, seams like a sane approach
For devices that have different dma requirements depending upon function (i.e. contigous for scanout, iommu-mapped with large pages for overlay image) we could use a bunch of subdevices to map all the different requirements.
Another similar idea could also solve the backing storage negotiation problem: A dma_buf object is at first just an abstract handle (with a fixed size), then you can attach users with dma_buf_attach_device. Then, after all users have themselves attached, the first get_scatterlist call actually allocates the backing storage for the buffer. A subsequent attach_device would then return -EBUSY.
The tricky thing is that we can't rely on all the devices seeing the buffer before the buffer starts getting used. Think video record app.. user launches app and it starts in preview mode. Then they select which format they want to encode to (ex. mpeg4 vs h264). Then they hit 'record'. And finally *now* it starts passing buffers to the encoder, the same buffers that were already circulating between the display and camera.
Well.. it is a hypothetical issue.. if the camera already has the same requirements as the encoder on some platform, then this wouldn't be a problem. So may or may not need to be solved in the first version. Certainly deferring allocating of backing pages helps. (It was also an idea Arnd had.)
I see two ways out of this:
- Either drop the current set of buffers and create a new set that can
also be shared with the encoder.
- Or attach all devices that might ever need to touch that buffer before
the first use. But if we go with my 2-step scheme, that's strictly a userspace issue: In both cases we need to export a new attach method in every subsystem that uses shared buffers (the dma_buf_map is probably implicit on first real usage).
But is it a solvable userspace problem? w/ something like GStreamer you have a lot of flexibility for dynamic pipeline changes. AFAIK apps like cheese utilize this. Ideally we wouldn't suddenly introduce new restrictions that make things break w/ hw accel path, compared to how userspace developers are developing w/ sw based encoders.
That said, I guess it is likely that different hw encoders would have same restrictions. So the 2-step approach seems like a pragmatic solution that would handle most cases, and maybe the cases that it doesn't handle are only hypothetical.
oh, and if the importing driver is bracketing DMA to/from the buffer w/ get/put scatter list, it should even be possible to implement more fancy reallocation or migration stuff later if it is actually needed, without changing dmabuf API. Or at least this is my theory.
The restriction is that get_sglist needs to be called only from a context that could block.
(And maybe needs to return some hint to the importing driver that the buffer location has/hasn't changed? I'm not sure.. if we are returning device addresses thanks to iommu and having passed the 'struct device' ptr to get_sglist(), then all the optimizations of caching the mapping of the buffer is all behind the scenes?)
BR, -R
Imho the appeal of this is that there's no preferred party of a shared buffer (the one the buffer originated from and allocated it), but that is handled by the dma core (and some platform specific magic for really weird cases).
We could even go so far and make the dma_buf creation a real syscall instead of shoving it into some random ioctls.
hmm, I don't quite get why making it a syscall would help..
BR, -R
Cheers, Daniel
Daniel Vetter Mail: daniel@ffwll.ch Mobile: +41 (0)79 365 57 48