Hi Daniel,
On 09/28/2011 04:22 PM, Daniel Vetter wrote:
On Wed, Sep 28, 2011 at 10:51:08AM +0200, Hans Verkuil wrote:
On Tuesday, September 27, 2011 16:19:56 Daniel Vetter wrote:
Hi Hans,
I'll try to explain a bit; after all, I've been pushing this attachment business quite a bit.
On Tue, Sep 27, 2011 at 03:24:24PM +0200, Hans Verkuil wrote:
OK, it is not clear to me what the purpose is of the attachments.
If I understand the discussion from this list correctly, then the idea is that each device that wants to use this buffer first attaches itself by calling dma_buf_attach().
Then at some point the application asks some driver to export the buffer. So the driver calls dma_buf_export() and passes its own dma_buf_ops. In other words, this becomes the driver that 'controls' the memory, right?
Actually, the ordering is the other way round. First, some driver calls dma_buf_export; userspace then passes the fd around to all other drivers, which do an import and call attach. While all this happens, the driver that exported the dma_buf does not (yet) allocate any backing storage.
Ah. This really should be documented better :-)
Another driver that receives the fd will call dma_buf_get() and can then call e.g. get_scatterlist from dma_buf->ops. (As an aside: I would make inline functions that take a dma_buf pointer and call the corresponding op, rather than requiring drivers to go through ops directly)
Well, drivers should only call get_scatterlist when they actually need to access the memory. This way the originating driver can go through the list of all attached devices and decide where to allocate backing storage on the first get_scatterlist.
Timing is something I am worried about with the above approach. At Linaro Connect in Cambridge, I raised the idea of drivers registering their memory requirement capabilities at startup (and vice versa). I will try to elaborate on it again below:
Driver A and driver B wish to share the same pool of buffers. For some reason the two have different buffer allocation requirements (one needs a contiguous allocation, the other can use its own IOMMU to manage the buffer chunks).

When driver A's probe is called, it will *also* register its capabilities with some buffer bank (for v4l2 we can think of extending vb2). Similarly, driver B will register its capabilities. Where to keep this piece of metadata, I am not sure at this point.

When the first allocation call happens (from the application, which may call A or B in whatever order), the allocator will check the common set of buffer requirements and then make an appropriate allocation, going through all the drivers' buffer capability information (which should be quick). Even if the application asks the drivers to release the buffers and requests them again some time later, the allocation will not take much time, as the metadata will already be held by the (central) buffer bank.

Having said that, I am again not sure how to handle the scenario where the buffer capabilities of the two devices don't match at all. Maybe such cases can be taken care of during the driver design itself.
For the above, I was thinking of a multimedia application like a camera. The user may open the camera app, which starts showing a preview. He may choose to close it and then immediately reopen it for his own crazy reasons. If we take too much time just deciding what kind of buffer to allocate, the start-up time of the application will be too long.
Unless I am mistaken, there is nothing in the attachment data structure at the moment to help determine the best device to use for the scatterlist. Right? It's just a list of devices together with an opaque pointer.
Well, I want a struct device pointer in there, not something opaque. But yes, an attachment is just bookkeeping.
Another thing I don't get (sorry, I must be really dense today) is that the get_scatterlist op is set by the dma_buf_export call. I would expect that op to be part of the attachment data, since I would expect it to be device-specific.
There's a dma_buf_attachment parameter so you can get at the device the mapping is for.
At least, my assumption is that the actual memory is allocated by the first call to get_scatterlist?
Yes.
Actually, I think this might be key: who allocates the actual memory and when? There is no documentation whatsoever on this crucial topic.
Ok, let me elaborate a bit on how I see this. The memory will be allocated by whichever driver exported the dma_buf originally. We'll probably end up with a bunch of helper functions for the common case, but this approach allows crazy stuff like sharing special memory (e.g. TILER on omap) which can only be accessed by certain devices. I think this is easiest with a few examples:
On sane hw, e.g. a buffer exported by i915 on x86: attach is a noop (and hence should be optional); in get_scatterlist I'll just create an sg_list out of the struct page * array I get from the gem backing storage, then call map_sg and return that sg_list (now mapped into the device address space).
On insane hw with strange dma requirements (like dram bank placement, contiguous allocations, huge pages for puny tlbs, ...): attach could check whether the buffer is already fixed or not (e.g. the kernel fb console probably can't be moved) and, if so, whether the fixed buffer can be mapped by the device. That's why attach needs a return value. If the buffer is not pinned or not yet allocated, do nothing.
On get_scatterlist walk the attachment list and grab the dma requirements of all devices (via some platform/dma specific stuff attached to struct device). Then walk through a list of dma allocators your platform provides (page_alloc, cma, ...) until you have one that satisfies the requirements and alloc the backing storage.
Then create an sg_list out of it and map_sg it, like above.
For arm we probably need some helper functions to make this easier.
Sharing of a special buffer object, like a tiled buffer remapped into omap's TILER: attach would check whether the device can actually access that special io area (iirc on omap only certain devices can access the TILER). get_scatterlist would just create a 1-entry sg_list that points directly to the physical address (in device space) of the remapped buffer. This is why I want get_scatterlist to return a mapped buffer (and for api cleanliness).
Cheers, Daniel
Regards, Subash