Re: [Linaro-mm-sig] Wayland generic dmabuf protocol

12 Jun 2014


On Thu, Jun 12, 2014 at 2:01 AM, Pekka Paalanen
pekka.paalanen@collabora.co.uk wrote:
...
On Wed, 11 Jun 2014 12:00:57 -0400
Rob Clark robdclark@gmail.com wrote:
...
On Mon, Jun 9, 2014 at 8:44 AM, Pekka Paalanen
pekka.paalanen@collabora.co.uk wrote:
...
On Mon, 9 Jun 2014 12:23:18 +0100
Daniel Stone daniel@fooishbar.org wrote:
...
Hi,
On 9 June 2014 12:06, Pekka Paalanen pekka.paalanen@collabora.co.uk wrote:
...
On Mon, 9 Jun 2014 11:00:04 +0200
Benjamin Gaignard benjamin.gaignard@linaro.org wrote:
...
One of the main comments on the latest patches was that wl_dmabuf uses
DRM for buffer allocation.
This appears to be an issue since Wayland doesn't want to rely on one
specific framework (DRM, or V4L2) for buffer allocation, so we have
started working on a "central dmabuf allocation" on the kernel side. The
goal is to provide something as generic as possible to make it acceptable
to Wayland.
Why would Wayland need a central allocator for dmabuf?
I think you've just answered your own question further below:
...
...
On my hardware the patches you have (+ this one on gstwaylandsink
https://bugzilla.gnome.org/show_bug.cgi?id=711155) allow me to do zero
copy between the hardware video decoder and the display engine. I
haven't implemented the GPU path yet because my hardware is able to
compose a few video overlay planes, and that was enough for my tests.
Right.
What I have been thinking is, that the compositor must be able to use
the new wl_buffer and we need to guarantee that before-hand. If the
compositor fails to use a wl_buffer when the client has already
attached it to a wl_surface and it is time to repaint, it is too late
and the user will see a glitch. Recovering from that requires asking
the client to provide a new wl_buffer of a different kind, which might
take time. Or a very rude compositor would just send a protocol error,
and then we'd get bug reports like "the video player just disappears
when I try to play (and ps. I have an old kernel that doesn't support
importing whatever)".
I believe we must allow the compositor to test the wl_buffer before it
is usable for the client. That is the reason for the roundtrippy design
of the below proposal.
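
The roundtrip described above can be pictured with a small sketch. It is only an illustration of the idea, not the proposed protocol: the names `BufferParams`, `Compositor.create_request`, `Reply` and `client_setup`, the stride-alignment check, and the fallback to a shared-memory buffer are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Reply(Enum):
    CREATED = auto()  # compositor test-imported the dmabuf; buffer is usable
    FAILED = auto()   # import failed; client must offer a different buffer


@dataclass(frozen=True)
class BufferParams:
    """Hypothetical description a client sends along with its dmabuf."""
    fourcc: str
    width: int
    height: int
    stride: int


class Compositor:
    """Toy compositor that validates a buffer before acknowledging it."""

    def __init__(self, importable_formats, stride_alignment):
        self.importable_formats = set(importable_formats)
        self.stride_alignment = stride_alignment

    def create_request(self, params: BufferParams) -> Reply:
        # Stand-in for a real test import (EGL, DRM FB, ...): assumed checks.
        ok = (params.fourcc in self.importable_formats
              and params.stride % self.stride_alignment == 0)
        return Reply.CREATED if ok else Reply.FAILED


def client_setup(compositor: Compositor, params: BufferParams) -> str:
    """One roundtrip at setup time, before any buffer is attached."""
    if compositor.create_request(params) is Reply.CREATED:
        return "dmabuf"
    # Assumed fallback: the client switches to another buffer type up front,
    # so the failure never shows up as a glitch at repaint time.
    return "shm"
```

The point of the design is that the failure surfaces during setup, where the client can still react, rather than at repaint.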
A central allocator would solve these issues, by having everyone agree on
the restrictions upfront, instead of working out which of the media decode
engine, camera, GPU, or display controller is the lowest common
denominator, and forcing all allocations through there.
One such solution was discussed a while back WRT ION:
https://lwn.net/Articles/565469/
See the 'possible solutions' part for a way for people to agree on
restrictions wrt tiling, stride, contiguousness, etc.
Hi,
that's an excellent article. I didn't know that delayed allocation of
dmabufs was not even possible yet, which would have allowed us to
not think about importing failures and simply let the client fall back
with "ok, don't use dmabuf with this particular device then".
hrm?  I know of at least a couple drm drivers that defer allocation of
backing pages..
I came across a bit harsh there. So it is possible, and a few drivers
might even do it already, but is there even an intention of requiring
all drivers to be able to defer allocation?
not sure I'd go as far as to require it, but it is a pretty silly
optimization to skip..
...
Though if migration is going to work, the only downside of not doing
deferred allocation would be a performance penalty in the beginning,
right?
right
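
As a rough illustration of deferred allocation, the sketch below collects the constraints of every attached device and only picks backing memory on the first map. It is a toy model in Python, not kernel code: the class `DeferredBuffer`, its fields, and the two example constraints (contiguity and addressable bits) are assumptions chosen for the example.

```python
class DeferredBuffer:
    """Toy model of a shared buffer whose backing pages are allocated lazily."""

    def __init__(self, size):
        self.size = size
        self.attached = []   # (device, needs_contiguous, max_address_bits)
        self.pages = None    # nothing is allocated at creation time

    def attach(self, device, needs_contiguous=False, max_address_bits=64):
        if self.pages is not None:
            # Simplification: a real implementation could migrate instead.
            raise RuntimeError("already allocated; late attach needs migration")
        self.attached.append((device, needs_contiguous, max_address_bits))

    def map(self):
        """Allocate on first use, honouring every attached device."""
        if self.pages is None:
            self.pages = {
                "size": self.size,
                "contiguous": any(c for _, c, _ in self.attached),
                "address_bits": min((b for _, _, b in self.attached),
                                    default=64),
            }
        return self.pages
```

If all devices attach before the first map, the buffer lands in the right place immediately; without deferral the same result would need a migration, which is the performance penalty mentioned above.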
...
...
...
What is the conclusion here?
Wayland protocol does not need to consider import failures at all, and
can simply punt those as protocol errors, which essentially kill the app
if they ever happen?
Do we need to wait for the central allocator in kernel to materialize
before we can design the protocol? Is it simply too early to try to do
it now?
I do tend to think the ION/central-allocator is just substituting one
problem for another.  It doesn't really solve the problem of how
different devices which don't actually know each other can decide on
buffers that they can share.  On a phone/tablet/etc you know up front
when building the kernel what devices there are and in what use-cases
they will be used, etc.  But that isn't really solving the more
general case.
Right, as I have been following the PC side in the past a lot more than
ARM or embedded, a central allocator seemed a little strange as the
final solution to me too.
...
...
Was the idea of dmabuf in-kernel constraint negotiation with delayed
allocation rejected in favour of a central allocator?
not really, that I know of.  I still think we need to spiff out
dma-mapping to better handle placement constraints.  (Although still
prefer format constraints to be a userspace topic.)
Sure. What I am specifically interested in is which things would be
left for user space to control and match, as that would affect the
Wayland protocol for dmabufs via APIs like GBM and V4L.
I try to divide buffer constraints into two categories:
1) placement, ie. where the actual pages go (contiguous, special
memory range, etc)
2) format (fourcc, tiling format, pitch restrictions)
For most (all?) of the drm drivers, at the GEM level we do not
necessarily have any information about category #2.  All the kernel
cares about is category #1 in most cases.
Also, in at least some cases (gstreamer is a good example), there is
already a mechanism in place for negotiating #2.
This is my reasoning behind the conclusion that dmabuf (and kernel
level APIs) should care about #1, and userspace should care about #2.
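
The split between the two categories can be sketched as two separate data types with two separate negotiation steps. This is a hypothetical model for illustration only: `Placement`, `FormatCaps`, `merge_placement` and `negotiate_format` are invented names, and the rule of picking the alphabetically first common fourcc is an arbitrary assumption.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Category 1: where the pages go (the kernel's concern)."""
    contiguous: bool
    max_address_bits: int


@dataclass(frozen=True)
class FormatCaps:
    """Category 2: how the pixels are laid out (userspace's concern)."""
    fourccs: frozenset
    pitch_alignment: int


def merge_placement(a: Placement, b: Placement) -> Placement:
    # The strictest requirement of either device wins.
    return Placement(a.contiguous or b.contiguous,
                     min(a.max_address_bits, b.max_address_bits))


def negotiate_format(a: FormatCaps, b: FormatCaps):
    # Userspace-side negotiation: common fourcc plus a pitch both accept.
    common = a.fourccs & b.fourccs
    if not common:
        return None
    pitch = a.pitch_alignment * b.pitch_alignment // math.gcd(
        a.pitch_alignment, b.pitch_alignment)
    return sorted(common)[0], pitch
```

Keeping the types apart mirrors the argument: the kernel side never needs to see a fourcc, and the userspace side never needs to know where the pages live.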
...
...
pengutronix is doing some work in this area:
http://elinux.org/images/b/b0/OSELAS.Presentation-DMABUF-migration.pdf
That is cool, and it also tells me that it is ok for the initial dmabuf
sharing and creating a wl_buffer protocol object to be expensive
(require one roundtrip per batch of buffers), as the setup may involve
migration even in a good case and buffer re-use is heavily recommended.
This brings a question in my mind.
A Wayland compositor must be able to use a dmabuf-based wl_buffer for
at least its fallback compositing path, let's say GLESv2 and we are
able to directly texture from the dmabuf. Then the compositor sees an
opportunity to promote the surface to a hardware overlay, and attempts
to, say, import the dmabuf a second time as a DRM FB. If it is not
possible to satisfy all of exporter, EGL-import and DRM-import
restrictions at the same time, and especially if exporter vs. DRM-import
would cause ping-ponging, it would be better to just let the DRM-import
fail, and continue with GLESv2 compositing.
Would you agree?
Could dmabuf related interfaces somehow allow for the user space to
choose how much pain is tolerable for the import to succeed?
hmm, this is actually an interesting idea.  So far the assumption has
been that, if you could not actually share buffers between devices,
userspace would do something different.
It seems like it would be worthwhile for userspace to know how
expensive sharing will be vs just using the window surface as a
texture..
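
One way to picture userspace choosing "how much pain is tolerable" is sketched below. No such cost query exists in the interfaces discussed in this thread; the `ImportCost` levels and the `pick_compositing_path` helper are purely hypothetical and only illustrate the decision.

```python
from enum import IntEnum


class ImportCost(IntEnum):
    """Hypothetical cost levels an import could report to userspace."""
    FREE = 0          # buffer already satisfies the importer
    MIGRATE_ONCE = 1  # one-time migration at setup
    PING_PONG = 2     # buffer would bounce between placements every frame
    IMPOSSIBLE = 3    # import cannot succeed at all


def pick_compositing_path(overlay_cost: ImportCost,
                          tolerance: ImportCost = ImportCost.MIGRATE_ONCE) -> str:
    """Promote to a hardware overlay only if the import is cheap enough."""
    if overlay_cost is not ImportCost.IMPOSSIBLE and overlay_cost <= tolerance:
        return "overlay"
    # Otherwise keep texturing from the dmabuf in the GLESv2 fallback path.
    return "gles2"
```

With the default tolerance, a one-time migration is accepted, while ping-ponging or an impossible import leaves the surface on the GLESv2 path.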
BR,
-R
...
Thanks,
pq
