Following a recent discussion at last Plumbers, John Stultz, Sumit
Sewal, TJ Mercier and I came to an agreement that we should document
what the dma-buf heaps names are expected to be, and what the buffers
attributes you'll get should be documented.
Let's create that doc to make sure those attributes and names are
guaranteed going forward.
Reviewed-by: T.J. Mercier <tjmercier(a)google.com>
Signed-off-by: Maxime Ripard <mripard(a)kernel.org>
---
Changes from v3:
- Add TJ RB
- Fix a few typos.
Changes from v2:
- Remove exhaustive list of names for platforms, and just mention the
alternatives.
- Add MAINTAINERS entry
Changes from v1:
- Add the mention that the cma / reserved heap is optional.
---
Documentation/userspace-api/dma-buf-heaps.rst | 25 +++++++++++++++++++
Documentation/userspace-api/index.rst | 1 +
MAINTAINERS | 1 +
3 files changed, 27 insertions(+)
create mode 100644 Documentation/userspace-api/dma-buf-heaps.rst
diff --git a/Documentation/userspace-api/dma-buf-heaps.rst b/Documentation/userspace-api/dma-buf-heaps.rst
new file mode 100644
index 000000000000..535f49047ce6
--- /dev/null
+++ b/Documentation/userspace-api/dma-buf-heaps.rst
@@ -0,0 +1,25 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================
+Allocating dma-buf using heaps
+==============================
+
+Dma-buf Heaps are a way for userspace to allocate dma-buf objects. They are
+typically used to allocate buffers from a specific allocation pool, or to share
+buffers across frameworks.
+
+Heaps
+=====
+
+A heap represents a specific allocator. The Linux kernel currently supports the
+following heaps:
+
+ - The ``system`` heap allocates virtually contiguous, cacheable, buffers.
+
+ - The ``cma`` heap allocates physically contiguous, cacheable,
+ buffers. Only present if a CMA region is present. Such a region is
+ usually created either through the kernel commandline through the
+ `cma` parameter, a memory region Device-Tree node with the
+ `linux,cma-default` property set, or through the `CMA_SIZE_MBYTES` or
+ `CMA_SIZE_PERCENTAGE` Kconfig options. Depending on the platform, it
+ might be called ``reserved``, ``linux,cma``, or ``default-pool``.
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index b1395d94b3fd..9cbe4390c872 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -42,10 +42,11 @@ Devices and I/O
.. toctree::
:maxdepth: 1
accelerators/ocxl
+ dma-buf-heaps
dma-buf-alloc-exchange
gpio/index
iommufd
media/index
dcdbas
diff --git a/MAINTAINERS b/MAINTAINERS
index 8e0736dc2ee0..ef0364e65aef 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6911,10 +6911,11 @@ R: T.J. Mercier <tjmercier(a)google.com>
L: linux-media(a)vger.kernel.org
L: dri-devel(a)lists.freedesktop.org
L: linaro-mm-sig(a)lists.linaro.org (moderated for non-subscribers)
S: Maintained
T: git https://gitlab.freedesktop.org/drm/misc/kernel.git
+F: Documentation/userspace-api/dma-buf-heaps.rst
F: drivers/dma-buf/dma-heap.c
F: drivers/dma-buf/heaps/*
F: include/linux/dma-heap.h
F: include/uapi/linux/dma-heap.h
F: tools/testing/selftests/dmabuf-heaps/
--
2.48.1
Following a recent discussion at last Plumbers, John Stultz, Sumit
Sewal, TJ Mercier and I came to an agreement that we should document
what the dma-buf heaps names are expected to be, and what the buffers
attributes you'll get should be documented.
Let's create that doc to make sure those attributes and names are
guaranteed going forward.
Signed-off-by: Maxime Ripard <mripard(a)kernel.org>
---
Changes from v2:
- Remove exhaustive list of names for platforms, and just mention the
alternatives.
- Add MAINTAINERS entry
Changes from v1:
- Add the mention that the cma / reserved heap is optional.
---
Documentation/userspace-api/dma-buf-heaps.rst | 25 +++++++++++++++++++
Documentation/userspace-api/index.rst | 1 +
MAINTAINERS | 1 +
3 files changed, 27 insertions(+)
create mode 100644 Documentation/userspace-api/dma-buf-heaps.rst
diff --git a/Documentation/userspace-api/dma-buf-heaps.rst b/Documentation/userspace-api/dma-buf-heaps.rst
new file mode 100644
index 000000000000..5b92d69646f6
--- /dev/null
+++ b/Documentation/userspace-api/dma-buf-heaps.rst
@@ -0,0 +1,25 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================
+Allocating dma-buf using heaps
+==============================
+
+Dma-buf Heaps are a way for userspace to allocate dma-buf objects. They are
+typically used to allocate buffers from a specific allocation pool, or to share
+buffers across frameworks.
+
+Heaps
+=====
+
+A heap represent a specific allocator. The Linux kernel currently supports the
+following heaps:
+
+ - The ``system`` heap allocates virtually contiguous, cacheable, buffers
+
+ - The ``cma`` heap allocates physically contiguous, cacheable,
+ buffers. Only present if a CMA region is present. Such a region is
+ usually created either through the kernel commandline through the
+ `cma` parameter, a memory region Device-Tree node with the
+ `linux,cma-default` property set, or through the `CMA_SIZE_MBYTES` or
+ `CMA_SIZE_PERCENTAGE` Kconfig options. Depending on the platform, it
+ might be called ``reserved``, ``linux,cma``, or ``default-pool``.
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index b1395d94b3fd..9cbe4390c872 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -42,10 +42,11 @@ Devices and I/O
.. toctree::
:maxdepth: 1
accelerators/ocxl
+ dma-buf-heaps
dma-buf-alloc-exchange
gpio/index
iommufd
media/index
dcdbas
diff --git a/MAINTAINERS b/MAINTAINERS
index 8e0736dc2ee0..ef0364e65aef 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6911,10 +6911,11 @@ R: T.J. Mercier <tjmercier(a)google.com>
L: linux-media(a)vger.kernel.org
L: dri-devel(a)lists.freedesktop.org
L: linaro-mm-sig(a)lists.linaro.org (moderated for non-subscribers)
S: Maintained
T: git https://gitlab.freedesktop.org/drm/misc/kernel.git
+F: Documentation/userspace-api/dma-buf-heaps.rst
F: drivers/dma-buf/dma-heap.c
F: drivers/dma-buf/heaps/*
F: include/linux/dma-heap.h
F: include/uapi/linux/dma-heap.h
F: tools/testing/selftests/dmabuf-heaps/
--
2.48.1
On Fri, 14 Feb 2025 at 21:19, Boris Brezillon
<boris.brezillon(a)collabora.com> wrote:
>
> On Fri, 14 Feb 2025 18:37:14 +0530
> Sumit Garg <sumit.garg(a)linaro.org> wrote:
>
> > On Fri, 14 Feb 2025 at 15:37, Jens Wiklander <jens.wiklander(a)linaro.org> wrote:
> > >
> > > Hi,
> > >
> > > On Thu, Feb 13, 2025 at 6:39 PM Daniel Stone <daniel(a)fooishbar.org> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Thu, 13 Feb 2025 at 15:57, Jens Wiklander <jens.wiklander(a)linaro.org> wrote:
> > > > > On Thu, Feb 13, 2025 at 3:05 PM Daniel Stone <daniel(a)fooishbar.org> wrote:
> > > > > > But just because TEE is one good backend implementation, doesn't mean
> > > > > > it should be the userspace ABI. Why should userspace care that TEE has
> > > > > > mediated the allocation instead of it being a predefined range within
> > > > > > DT?
> > > > >
> > > > > The TEE may very well use a predefined range that part is abstracted
> > > > > with the interface.
> > > >
> > > > Of course. But you can also (and this has been shipped on real
> > > > devices) handle this without any per-allocation TEE needs by simply
> > > > allocating from a memory range which is predefined within DT.
> > > >
> > > > From the userspace point of view, why should there be one ABI to
> > > > allocate memory from a predefined range which is delivered by DT to
> > > > the kernel, and one ABI to allocate memory from a predefined range
> > > > which is mediated by TEE?
> > >
> > > We need some way to specify the protection profile (or use case as
> > > I've called it in the ABI) required for the buffer. Whether it's
> > > defined in DT seems irrelevant.
> > >
> > > >
> > > > > > What advantage
> > > > > > does userspace get from having to have a different codepath to get a
> > > > > > different handle to memory? What about x86?
> > > > > >
> > > > > > I think this proposal is looking at it from the wrong direction.
> > > > > > Instead of working upwards from the implementation to userspace, start
> > > > > > with userspace and work downwards. The interesting property to focus
> > > > > > on is allocating memory, not that EL1 is involved behind the scenes.
> > > > >
> > > > > From what I've gathered from earlier discussions, it wasn't much of a
> > > > > problem for userspace to handle this. If the kernel were to provide it
> > > > > via a different ABI, how would it be easier to implement in the
> > > > > kernel? I think we need an example to understand your suggestion.
> > > >
> > > > It is a problem for userspace, because we need to expose acceptable
> > > > parameters for allocation through the entire stack. If you look at the
> > > > dmabuf documentation in the kernel for how buffers should be allocated
> > > > and exchanged, you can see the negotiation flow for modifiers. This
> > > > permeates through KMS, EGL, Vulkan, Wayland, GStreamer, and more.
> > >
> > > What dma-buf properties are you referring to?
> > > dma_heap_ioctl_allocate() accepts a few flags for the resulting file
> > > descriptor and no flags for the heap itself.
> > >
> > > >
> > > > Standardising on heaps allows us to add those in a similar way.
> > >
> > > How would you solve this with heaps? Would you use one heap for each
> > > protection profile (use case), add heap_flags, or do a bit of both?
>
> I would say one heap per-profile.
>
And then it would have a per vendor multiplication factor as each
vendor enforces memory restriction in a platform specific manner which
won't scale.
> >
> > Christian gave an historical background here [1] as to why that hasn't
> > worked in the past with DMA heaps given the scalability issues.
> >
> > [1] https://lore.kernel.org/dri-devel/e967e382-6cca-4dee-8333-39892d532f71@gmai…
>
> Hm, I fail to see where Christian dismiss the dma-heaps solution in
> this email. He even says:
>
> > If the memory is not physically attached to any device, but rather just
> memory attached to the CPU or a system wide memory controller then
> expose the memory as DMA-heap with specific requirements (e.g. certain
> sized pages, contiguous, restricted, encrypted, ...).
I am not saying Christian dismissed DMA heaps but rather how
scalability is an issue. What we are proposing here is a generic
interface via TEE to the firmware/Trusted OS which can perform all the
platform specific memory restrictions. This solution will scale across
vendors.
>
> >
> > >
> > > > If we
> > > > have to add different allocation mechanisms, then the complexity
> > > > increases, permeating not only into all the different userspace APIs,
> > > > but also into the drivers which need to support every different
> > > > allocation mechanism even if they have no opinion on it - e.g. Mali
> > > > doesn't care in any way whether the allocation comes from a heap or
> > > > TEE or ACPI or whatever, it cares only that the memory is protected.
> > > >
> > > > Does that help?
> > >
> > > I think you're missing the stage where an unprotected buffer is
> > > received and decrypted into a protected buffer. If you use the TEE for
> > > decryption or to configure the involved devices for the use case, it
> > > makes sense to let the TEE allocate the buffers, too. A TEE doesn't
> > > have to be an OS in the secure world, it can be an abstraction to
> > > support the use case depending on the design. So the restricted buffer
> > > is already allocated before we reach Mali in your example.
> > >
> > > Allocating restricted buffers from the TEE subsystem saves us from
> > > maintaining proxy dma-buf heaps.
>
> Honestly, when I look at dma-heap implementations, they seem
> to be trivial shells around existing (more complex) allocators, and the
> boiler plate [1] to expose a dma-heap is relatively small. The dma-buf
> implementation, you already have, so we're talking about a hundred
> lines of code to maintain, which shouldn't be significantly more than
> what you have for the new ioctl() to be honest.
It will rather be redundant vendor specific code under DMA heaps
calling into firmware/Trusted OS to enforce memory restrictions as you
can look into Mediatek example [1]. With TEE subsystem managing that
it won't be the case as we will provide a common abstraction for the
communication with underlying firmware/Trusted OS.
[1] https://lore.kernel.org/linux-arm-kernel/20240515112308.10171-1-yong.wu@med…
> And I'll insist on what
> Daniel said, it's a small price to pay to have a standard interface to
> expose to userspace. If dma-heaps are not used for this kind things, I
> honestly wonder what they will be used for...
Let's try not to forcefully find a use-case for DMA heaps when there
is a better alternative available. I am still failing to see why you
don't consider following as a standardised user-space interface:
"When user-space has to work with restricted memory, ask TEE device to
allocate it"
-Sumit
Hi
Am 21.02.25 um 16:51 schrieb andriy.shevchenko(a)linux.intel.com:
> On Fri, Feb 21, 2025 at 11:36:00AM +0000, Aditya Garg wrote:
>> From: Kerem Karabay <kekrby(a)gmail.com>
>>
>> Add XRGB8888 emulation helper for devices that only support BGR888.
> ...
>
>> + for (x = 0; x < pixels; x++) {
>> + pix = le32_to_cpu(sbuf32[x]);
>> + /* write red-green-blue to output in little endianness */
>> + *dbuf8++ = (pix & 0x00ff0000) >> 16;
>> + *dbuf8++ = (pix & 0x0000ff00) >> 8;
>> + *dbuf8++ = (pix & 0x000000ff) >> 0;
> put_unaligned_be24()
I'm all for sharing helper code, but maybe not here.
- DRM pixel formats are always little endian.
- CPU encoding is LE or BE.
- Pixel-component order can be anything: RGB/BGR/etc.
So the code has a comment to explain what happens here. Adding that call
with the _be24 postfix into the mix would make it more confusing.
>
>> + }
> ...
>
>> + static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = {
>> + 3,
>> + };
> One line?
>
> static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = { 3 };
I'd be ok, if there's a string preference in the kernel to use thins
style. Most of DRM doesn't AFAIK.
Best regards
Thomas
>
--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
Hi
Am 22.02.25 um 10:07 schrieb Aditya Garg:
>> What padding, please? Why TCP UAPI headers do not have these attributes?
>> Think about it, and think about what actually __packed does and how it affects
>> (badly) the code generation. Otherwise it looks like a cargo cult.
>>
>>> I tried removing __packed btw and driver no longer works.
>> So, you need to find a justification why. But definitely not due to padding in
>> many of them. They can go without __packed as they are naturally aligned.
> Alright, I did some debugging, basically printk sizeof(struct). Did it for both packed and unpacked with the following results:
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_request_header is 16
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_request_header_unpacked is 16
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_response_header is 20
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_response_header_unpacked is 20
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_simple_request is 32
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_simple_request_unpacked is 32
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_information is 65
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_information_unpacked is 68
In the unpacked version, there is a 3-byte gap after the
'bits_per_pixel' to align the next field. Using __packed removes those
gaps at the expense of runtime overhead.
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_frame is 12
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_frame_unpacked is 12
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_fb_request_footer is 80
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_fb_request_footer_unpacked is 80
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_fb_request is 48
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_fb_request_unpacked is 48
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_fb_request_response is 40
> Feb 22 13:02:04 MacBook kernel: size of struct appletbdrm_fb_request_response_unpacked is 40
>
> So, the difference in sizeof in unpacked and packed is only in appletbdrm_msg_information. So, I kept this packed, and removed it from others. The Touch Bar still works.
>
> So maybe keep just this packed?
The fields in the TCP header are aligned by design. Unfortunately, this
hardware's protocol is not. And there's no way of fixing this now. Just
keep all of them packed if you want. At least it's clear then what
happens. And if your hardware requires this, you can't do much anyway.
Best regards
Thomas
>>
>>
>> ...
>>
>>>>> + if (response->msg == APPLETBDRM_MSG_SIGNAL_READINESS) {
>>>>> + if (!readiness_signal_received) {
>>>>> + readiness_signal_received = true;
>>>>> + goto retry;
>>>>> + }
>>>>> +
>>>>> + drm_err(drm, "Encountered unexpected readiness signal\n");
>>>>> + return -EIO;
>>>>> + }
>>>>> +
>>>>> + if (actual_size != size) {
>>>>> + drm_err(drm, "Actual size (%d) doesn't match expected size (%lu)\n",
>>>>> + actual_size, size);
>>>>> + return -EIO;
>>>>> + }
>>>>> +
>>>>> + if (response->msg != expected_response) {
>>>>> + drm_err(drm, "Unexpected response from device (expected %p4ch found %p4ch)\n",
>>>>> + &expected_response, &response->msg);
>>>>> + return -EIO;
>>>> For three different cases the same error code, can it be adjusted more to the
>>>> situation?
>>> All these are I/O errors, you got any suggestion?
>> Your email client mangled the code so badly that it's hard to read. But I would
>> suggest to use -EINTR in the first case, and -EBADMSG. But also you may consider
>> -EPROTO.
> Thanks
>>>>> + }
>> ...
>>
>>>>> + if (ret)
>>>>> + return ret;
>>>>> + else if (!new_plane_state->visible)
>>>> Why 'else'? It's redundant.
>>> I’ve just followed what other drm drivers are doing here:
>>>
>>> https://elixir.bootlin.com/linux/v6.13.3/source/drivers/gpu/drm/tiny/bochs.…
>>> https://elixir.bootlin.com/linux/v6.13.3/source/drivers/gpu/drm/tiny/cirrus…
>>>
>>> And plenty more
>> A bad example is still a bad example. 'else' is simply redundant in this
>> case and add a noisy to the code.
>>
>>> I won’t mind removing else. You want that?
>> Sure.
>>
>> ...
>>
>>>>> + request_size = ALIGN(sizeof(struct appletbdrm_fb_request) +
>>>>> + frames_size +
>>>>> + sizeof(struct appletbdrm_fb_request_footer), 16);
>>>> Missing header for ALIGN().
>>>>
>>>> But have you checked overflow.h for the possibility of using some helper macros
>>>> from there? This is what should be usually done for k*alloc() in the kernel.
>>> I don’t really think we need a macro here.
>> Hmm... is frames_size known to be in a guaranteed range to make sure no
>> potential overflow happens?
> I don’t really find any cause of potential overflow.
>
>
>>>>> + appletbdrm_state->request = kzalloc(request_size, GFP_KERNEL);
>>>>> +
>>>>> + if (!appletbdrm_state->request)
>>>>> + return -ENOMEM;
>> ...
>>
>>>>> + request->msg_id = timestamp & 0xff;
>>>> Why ' & 0xff'?
>>> https://github.com/imbushuo/DFRDisplayKm/blob/master/src/DFRDisplayKm/DfrDi…
>> This is not an answer.
>> Why do you need this here? Isn't the type of msg_id enough?
> Hmm, I double checked this. msg_id is u8 in the Linux port so would anyways never exceed 0xff. I’ll remove this.
> Its different in the Windows driver.
>> ...
>>
>>>>> + adev->mode = (struct drm_display_mode) {
>>>> Why do you need a compound literal here? Perhaps you want to have that to be
>>>> done directly in DRM_MODE_INIT()?
>>> I really don’t find this as an issue. You want me to declare another structure, basically this?:
>> Nope, I'm asking if the DRM_MODE_INIT() is done in a way that it only can be
>> used for the static data. Seems like the case. Have you tried to convert
>> DRM_MODE_INIT() to be always a compound literal? Does it break things?
> Seems to be breaking things.
>>> struct drm_display_mode mode = {
>>> DRM_MODE_INIT(60, adev->height, adev->width,
>>> DRM_MODE_RES_MM(adev->height, 218),
>>> DRM_MODE_RES_MM(adev->width, 218))
>>> };
>>> adev->mode = mode;
>>>
>>>>> + DRM_MODE_INIT(60, adev->height, adev->width,
>>>>> + DRM_MODE_RES_MM(adev->height, 218),
>>>>> + DRM_MODE_RES_MM(adev->width, 218))
>>>>> + };
>> --
>> With Best Regards,
>> Andy Shevchenko
>>
>>
--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)