From: Rob Clark <robdclark(a)chromium.org>
Conversion to DRM GPU VA Manager[1], and adding support for Vulkan Sparse
Memory[2] in the form of:
1. A new VM_BIND submitqueue type for executing VM MSM_SUBMIT_BO_OP_MAP/
MAP_NULL/UNMAP commands
2. Extending the SUBMIT ioctl to allow submitting batches of one or more
MAP/MAP_NULL/UNMAP commands to a VM_BIND submitqueue
The UABI takes a slightly different approach from what other drivers have
done, and from what would make sense if starting from a clean sheet, i.e.
separate VM_BIND and EXEC ioctls. But since we have to maintain support
for the existing SUBMIT ioctl, and because fence, syncobj, and BO pinning
handling is largely the same between legacy "BO-table" style SUBMIT
ioctls and new-style VM updates submitted to a VM_BIND submitqueue, I
chose to extend the existing `SUBMIT` ioctl rather than add a new one.
I also did not implement support for synchronous VM_BIND commands. Since
userspace could just immediately wait for the `SUBMIT` to complete, I don't
think we need this extra complexity in the kernel. Synchronous/immediate
VM_BIND operations could be implemented with a 2nd VM_BIND submitqueue.
The corresponding mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533
This series can be found in MR form, if you prefer:
https://gitlab.freedesktop.org/drm/msm/-/merge_requests/144
Changes in v2:
- Dropped Bibek Kumar Patro's arm-smmu patches[3], which have since been
merged.
- Pre-allocate all the things, and drop the HACK patch which disabled the
shrinker. This includes ensuring that vm_bo objects are allocated up
front, pre-allocating VMA objects, and pre-allocating the pages used for
pgtable updates. The latter uses the io_pgtable_cfg callbacks for
pgtable alloc/free that were initially added for panthor.
- Add back support for BO dumping for devcoredump.
- Link to v1 (RFC): https://lore.kernel.org/dri-devel/20241207161651.410556-1-robdclark@gmail.c…
[1] https://www.kernel.org/doc/html/next/gpu/drm-mm.html#drm-gpuvm
[2] https://docs.vulkan.org/spec/latest/chapters/sparsemem.html
[3] https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=909700
Rob Clark (34):
drm/gpuvm: Don't require obj lock in destructor path
drm/gpuvm: Remove bogus lock assert
drm/gpuvm: Allow VAs to hold soft reference to BOs
drm/gpuvm: Add drm_gpuvm_sm_unmap_va()
drm/msm: Rename msm_file_private -> msm_context
drm/msm: Improve msm_context comments
drm/msm: Rename msm_gem_address_space -> msm_gem_vm
drm/msm: Remove vram carveout support
drm/msm: Collapse vma allocation and initialization
drm/msm: Collapse vma close and delete
drm/msm: drm_gpuvm conversion
drm/msm: Use drm_gpuvm types more
drm/msm: Split submit_pin_objects()
drm/msm: Lazily create context VM
drm/msm: Add opt-in for VM_BIND
drm/msm: Mark VM as unusable on faults
drm/msm: Extend SUBMIT ioctl for VM_BIND
drm/msm: Add VM_BIND submitqueue
drm/msm: Add _NO_SHARE flag
drm/msm: Split out helper to get iommu prot flags
drm/msm: Add mmu support for non-zero offset
drm/msm: Add PRR support
drm/msm: Rename msm_gem_vma_purge() -> _unmap()
drm/msm: Split msm_gem_vma_new()
drm/msm: Pre-allocate VMAs
drm/msm: Pre-allocate vm_bo objects
drm/msm: Pre-allocate pages for pgtable entries
drm/msm: Wire up gpuvm ops
drm/msm: Wire up drm_gpuvm debugfs
drm/msm: Crashdump prep for sparse mappings
drm/msm: rd dumping prep for sparse mappings
drm/msm: Crashdec support for sparse
drm/msm: rd dumping support for sparse
drm/msm: Bump UAPI version
drivers/gpu/drm/drm_gpuvm.c | 141 ++--
drivers/gpu/drm/msm/Kconfig | 1 +
drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 25 +-
drivers/gpu/drm/msm/adreno/a2xx_gpummu.c | 5 +-
drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 17 +-
drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 17 +-
drivers/gpu/drm/msm/adreno/a5xx_debugfs.c | 4 +-
drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 24 +-
drivers/gpu/drm/msm/adreno/a5xx_power.c | 2 +-
drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 10 +-
drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 32 +-
drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 2 +-
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 51 +-
drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 6 +-
drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 10 +-
drivers/gpu/drm/msm/adreno/adreno_device.c | 4 -
drivers/gpu/drm/msm/adreno/adreno_gpu.c | 84 +-
drivers/gpu/drm/msm/adreno/adreno_gpu.h | 23 +-
.../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 14 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 18 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h | 2 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 18 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 14 +-
drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h | 4 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c | 6 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c | 28 +-
drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c | 12 +-
drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c | 4 +-
drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 19 +-
drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c | 12 +-
drivers/gpu/drm/msm/dsi/dsi_host.c | 14 +-
drivers/gpu/drm/msm/msm_debugfs.c | 20 +
drivers/gpu/drm/msm/msm_drv.c | 176 ++---
drivers/gpu/drm/msm/msm_drv.h | 35 +-
drivers/gpu/drm/msm/msm_fb.c | 18 +-
drivers/gpu/drm/msm/msm_fbdev.c | 2 +-
drivers/gpu/drm/msm/msm_gem.c | 437 +++++-----
drivers/gpu/drm/msm/msm_gem.h | 226 ++++--
drivers/gpu/drm/msm/msm_gem_prime.c | 15 +
drivers/gpu/drm/msm/msm_gem_submit.c | 234 +++++-
drivers/gpu/drm/msm/msm_gem_vma.c | 748 ++++++++++++++++--
drivers/gpu/drm/msm/msm_gpu.c | 146 ++--
drivers/gpu/drm/msm/msm_gpu.h | 132 +++-
drivers/gpu/drm/msm/msm_iommu.c | 285 ++++++-
drivers/gpu/drm/msm/msm_kms.c | 18 +-
drivers/gpu/drm/msm/msm_kms.h | 2 +-
drivers/gpu/drm/msm/msm_mmu.h | 38 +-
drivers/gpu/drm/msm/msm_rd.c | 62 +-
drivers/gpu/drm/msm/msm_ringbuffer.c | 4 +-
drivers/gpu/drm/msm/msm_submitqueue.c | 86 +-
include/drm/drm_gpuvm.h | 14 +-
include/uapi/drm/msm_drm.h | 98 ++-
52 files changed, 2359 insertions(+), 1060 deletions(-)
--
2.48.1
Am 18.03.25 um 20:22 schrieb Daniel Almeida:
> From: Asahi Lina <lina(a)asahilina.net>
>
> Drivers may want to support driver-private objects, which cannot be
> shared. This allows them to share a single lock and enables other
> optimizations.
>
> Add an `exportable` field to drm_gem_object, which blocks PRIME export
> if set to false. It is initialized to true in
> drm_gem_private_object_init.
We already have a method for doing that, which is used by almost all drivers (except for lsdc).
Basically you just create a function which checks the prerequisites for whether a buffer can be exported before calling drm_gem_prime_export(), and install that as the .export callback in drm_gem_object_funcs.
See amdgpu_gem_prime_export() for a simple example.
Regards,
Christian.
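To make the suggestion concrete, the pattern would look roughly like this (the my_* names and the is_private flag are hypothetical driver code; only drm_gem_prime_export() and the .export callback of drm_gem_object_funcs are existing kernel API):

```c
/* Hedged sketch only -- my_* names are hypothetical. */
struct my_gem_object {
	struct drm_gem_object base;
	bool is_private;	/* hypothetical driver-private flag */
};

static struct dma_buf *my_gem_prime_export(struct drm_gem_object *obj,
					   int flags)
{
	struct my_gem_object *bo =
		container_of(obj, struct my_gem_object, base);

	/* Refuse to export driver-private objects */
	if (bo->is_private)
		return ERR_PTR(-EINVAL);

	/* Everything else goes through the normal PRIME path */
	return drm_gem_prime_export(obj, flags);
}

static const struct drm_gem_object_funcs my_gem_object_funcs = {
	/* other callbacks elided */
	.export = my_gem_prime_export,
};
```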
>
> Signed-off-by: Asahi Lina <lina(a)asahilina.net>
> Signed-off-by: Daniel Almeida <daniel.almeida(a)collabora.com>
> ---
> drivers/gpu/drm/drm_gem.c | 1 +
> drivers/gpu/drm/drm_prime.c | 5 +++++
> include/drm/drm_gem.h | 8 ++++++++
> 3 files changed, 14 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
> index ee811764c3df4b4e9c377a66afd4967512ba2001..8f998fe6beecd285ce3e2d5badfa95eb7d7bd548 100644
> --- a/drivers/gpu/drm/drm_gem.c
> +++ b/drivers/gpu/drm/drm_gem.c
> @@ -195,6 +195,7 @@ void drm_gem_private_object_init(struct drm_device *dev,
>
> drm_vma_node_reset(&obj->vma_node);
> INIT_LIST_HEAD(&obj->lru_node);
> + obj->exportable = true;
> }
> EXPORT_SYMBOL(drm_gem_private_object_init);
>
> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c
> index 32a8781cfd67b82ece7b7b94625715171bb41917..20aa350280abe9a6ed6742e131ff50c65bc9dfa9 100644
> --- a/drivers/gpu/drm/drm_prime.c
> +++ b/drivers/gpu/drm/drm_prime.c
> @@ -387,6 +387,11 @@ static struct dma_buf *export_and_register_object(struct drm_device *dev,
> return dmabuf;
> }
>
> + if (!obj->exportable) {
> + dmabuf = ERR_PTR(-EINVAL);
> + return dmabuf;
> + }
> +
> if (obj->funcs && obj->funcs->export)
> dmabuf = obj->funcs->export(obj, flags);
> else
> diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
> index fdae947682cd0b7b06db5e35e120f049a0f30179..f700e4996eccb92597cca6b8c3df8e35b864c1e1 100644
> --- a/include/drm/drm_gem.h
> +++ b/include/drm/drm_gem.h
> @@ -432,6 +432,14 @@ struct drm_gem_object {
> * The current LRU list that the GEM object is on.
> */
> struct drm_gem_lru *lru;
> +
> + /**
> + * @exportable:
> + *
> + * Whether this GEM object can be exported via the drm_gem_object_funcs->export
> + * callback. Defaults to true.
> + */
> + bool exportable;
> };
>
> /**
>
Le mardi 04 mars 2025 à 13:15 +0530, Sumit Garg a écrit :
> On Tue, Mar 04, 2025 at 08:17:23AM +0100, Jens Wiklander wrote:
> > Hi Daniel,
> >
> > On Fri, Feb 21, 2025 at 3:12 PM Daniel Stone <daniel(a)fooishbar.org> wrote:
> > >
> > > Hi Sumit,
> > >
> > > On Fri, 21 Feb 2025 at 11:24, Sumit Garg <sumit.garg(a)linaro.org> wrote:
> > > > On Tue, 18 Feb 2025 at 21:52, Daniel Stone <daniel(a)fooishbar.org> wrote:
> > > > > dma-heaps was created to solve the problem of having too many
> > > > > 'allocate $n bytes from $specialplace' uAPIs. The proliferation was
> > > > > painful and made it difficult for userspace to do what it needed to
> > > > > do. Userspace doesn't _yet_ make full use of it, but the solution is
> > > > > to make userspace make full use of it, not to go create entirely
> > > > > separate allocation paths for unclear reasons.
> > > > >
> > > > > Besides, I'm writing this from a platform that implements SVP not via
> > > > > TEE. I've worked on platforms which implement SVP without any TEE,
> > > > > where the TEE implementation would be at best a no-op stub, and at
> > > > > worst flat-out impossible.
> > > >
> > > > Can you elaborate the non-TEE use-case for Secure Video Path (SVP) a
> > > > bit more? As to how the protected/encrypted media content pipeline
> > > > works? Which architecture support does your use-case require? Is there
> > > > any higher privileged level firmware interaction required to perform
> > > > media content decryption into restricted memory? Do you plan to
> > > > upstream corresponding support in near future?
> > >
> > > You can see the MTK SVP patches on list which use the MTK SMC to mediate it.
> > >
> > > There are TI Jacinto platforms which implement a 'secure' area
> > > configured statically by (IIRC) BL2, with static permissions defined
> > > for each AXI endpoint, e.g. CPU write + codec RW + dispc read. I've
> > > heard of another SoC vendor doing the same, but I don't think I can
> > > share those details. There is no TEE interaction.
> > >
> > > I'm writing this message from an AMD laptop which implements
> > > restricted content paths outside of TEE. I don't have the full picture
> > > of how SVP is implemented on AMD systems, but I do know that I don't
> > > have any TEE devices exposed.
> > >
> > > > Let me try to elaborate on the Secure Video Path (SVP) flow requiring
> > > > a TEE implementation (in general terms a higher privileged firmware
> > > > managing the pipeline as the kernel/user-space has no access
> > > > permissions to the plain text media content):
> > > >
> > > > - [...]
> > >
> > > Yeah, I totally understand the TEE usecase. I think that TEE is a good
> > > design to implement this. I think that TEE should be used for SVP
> > > where it makes sense.
> > >
> > > Please understand that I am _not_ arguing that no-one should use TEE for SVP!
> > >
> > > > > So, again, let's
> > > > > please turn this around: _why_ TEE? Who benefits from exposing this as
> > > > > completely separate to the more generic uAPI that we specifically
> > > > > designed to handle things like this?
> > > >
> > > > The bridging between DMA heaps and TEE would still require user-space
> > > > to perform an IOCTL into TEE to register the DMA-bufs as you can see
> > > > here [1]. Then it will rather be two handles for user-space to manage.
> > >
> > > Yes, the decoder would need to do this. That's common though: if you
> > > want to share a buffer between V4L2 and DRM, you have three handles:
> > > the V4L2 buffer handle, the DRM GEM handle, and the dmabuf you use to
> > > bridge the two.
> > >
> > > > Similarly during restricted memory allocation/free we need another
> > > > glue layer under DMA heaps to TEE subsystem.
> > >
> > > Yep.
> > >
> > > > The reason is simply which has been iterated over many times in the
> > > > past threads that:
> > > >
> > > > "If user-space has to interact with a TEE device for SVP use-case
> > > > then why it's not better to ask TEE to allocate restricted DMA-bufs
> > > > too"
> > >
> > > The first word in your proposition is load-bearing.
> > >
> > > Build out the usecase a little more here. You have a DRMed video
> > > stream coming in, which you need to decode (involving TEE for this
> > > usecase). You get a dmabuf handle to the decoded frame. You need to
> > > pass the dmabuf across to the Wayland compositor. The compositor needs
> > > to pass it to EGL/Vulkan to import and do composition, which in turn
> > > passes it to the GPU DRM driver. The output of the composition is in
> > > turn shared between the GPU DRM driver and the separate KMS DRM
> > > driver, with the involvement of GBM.
> > >
> > > For the platforms I'm interested in, the GPU DRM driver needs to
> > > switch into protected mode, which has no involvement at all with TEE -
> > > it's architecturally impossible to have TEE involved without moving
> > > most of the GPU driver into TEE and destroying performance. The
> > > display hardware also needs to engage protected mode, which again has
> > > no involvement with TEE and again would need to have half the driver
> > > moved into TEE for no benefit in order to do so. The Wayland
> > > compositor also has no interest in TEE: it tells the GPU DRM driver
> > > about the protected status of its buffers, and that's it.
> > >
> > > What these components _are_ opinionated about, is the way buffers are
> > > allocated and managed. We built out dmabuf modifiers for this usecase,
> > > and we have a good negotiation protocol around that. We also really
> > > care about buffer placement in some usecases - e.g. some display/codec
> > > hardware requires buffers to be sourced from contiguous memory, other
> > > hardware needs to know that when it shares buffers with another
> > > device, it needs to place the buffers outside of inaccessible/slow
> > > local RAM. So we built out dma-heaps, so every part of the component
> > > in the stack can communicate their buffer-placement needs in the same
> > > way as we do modifiers, and negotiate an acceptable allocation.
> > >
> > > That's my starting point for this discussion. We have a mechanism to
> > > deal with the fact that buffers need to be shared between different IP
> > > blocks which have their own constraints on buffer placement, avoiding
> > > the current problem of having every subsystem reinvent their own
> > > allocation uAPI which was burying us in impedance mismatch and
> > > confusion. That mechanism is dma-heaps. It seems like your starting
> > > point from this discussion is that you've implemented a TEE-centric
> > > design for SVP, and so all of userspace should bypass our existing
> > > cross-subsystem special-purpose allocation mechanism, and write
> > > specifically to one implementation. I believe that is a massive step
> > > backwards and an immediate introduction of technical debt.
> > >
> > > Again, having an implementation of SVP via TEE makes a huge amount of
> > > sense. Having _most_ SVP implementations via TEE still makes a lot of
> > > sense. Having _all_ SVP implementations eventually be via TEE would
> > > still make sense. But even if we were at that point - which we aren't
> > > - it still doesn't justify telling userspace 'use the generic dma-heap
> > > uAPI for every device-specific allocation constraint, apart from SVP
> > > which has a completely different way to allocate some bytes'.
> >
> > I must admit that I don't see how this makes a significant difference,
> > but then I haven't hacked much in the stacks you're talking about, so
> > I'm going to take your word for it.
> >
> > I've experimented with providing a dma-heap replacing the TEE API. The
> > implementation is more complex than I first anticipated, adding about
> > 400 lines to the patch set.
>
> I did anticipate this, but let's give it a try and see if DMA heaps
> really add any value from the user-space point of view. If they do, then
> it will be worth the maintenance overhead.
>
> > From user space, it looks like another
> > dma-heap. I'm using the names you gave earlier,
> > protected,secure-video, protected,trusted-ui, and
> > protected,secure-video-record. However, I wonder if we shouldn't use
> > "restricted" instead of "protected" since we had agreed to call it
> > restricted memory earlier.
>
> Let's stick with "restricted" memory buffer references only.
Until now, we didn't have a standard to weigh our naming choice against;
we simply wanted to move away from "secure", which didn't mean much, and
"restricted" met our needs. I think the discussion is worth having again,
now that there is a standard that settled on "protected". Matching the
Khronos standard would reduce a lot of confusion.
https://docs.vulkan.org/guide/latest/protected.html
regards,
Nicolas
On Mon, Jan 20, 2025 at 08:45:51PM +1100, Alexey Kardashevskiy wrote:
> > For CC I'm expecting the KVM fd to be the handle for the cVM, so any
> > RPCs that want to call into the secure world need the KVM FD to get
> > the cVM's identifier. Ie a "bind to cVM" RPC will need the PCI
> > information and the cVM's handle.
>
> And keep KVM fd open until unbind? Or just for the short time to call the
> PSP?
iommufd will keep the KVM fd alive so long as the vIOMMU object
exists. Other uses for kvm require it to work like this.
> > But it also seems to me that VFIO should be able to support putting
> > the device into the RUN state without involving KVM or cVMs.
>
> AMD's TDI bind handler in the PSP wants a guest handle ("GCTX") and a guest
> device BDFn, and VFIO has no desire to dive into this KVM business beyond
> IOMMUFD.
As in my other email, VFIO is not restricted to running VMs; useful
things should be available to apps like DPDK.
There is a use case for using TDISP and getting devices up into an
encrypted/attested state on pure bare metal without any KVM, and VFIO
should work in that use case too.
Jason
On 17/03/2025 03:37, feng.wei8(a)zte.com.cn wrote:
> From: FengWei <feng.wei8(a)zte.com.cn>
>
> Use max3() macro instead of nesting max() to simplify the return
> statement.
>
> Signed-off-by: FengWei <feng.wei8(a)zte.com.cn>
> ---
> v3 -> v4
> fix the format of this patch.
> drivers/dma-buf/dma-heap.c | 2 +-
You sent five versions per day of the same patch.
Look what was in v3:
v2 -> v3
fix the format of this patch
So you are doing the same over and over and sending it to us?
Srsly, ZTE, slow down and be sure you follow the process BEFORE you send
a flood of patches like that and learn on the go.
Best regards,
Krzysztof
On 12-02-25, 17:35, Jyothi Kumar Seerapu wrote:
> GSI hardware generates an interrupt for each transfer completion.
> For multiple messages within a single transfer, this results in
> N interrupts for N messages, leading to significant software
> interrupt latency.
>
> To mitigate this latency, utilize Block Event Interrupt (BEI) mechanism.
> Enabling BEI instructs the GSI hardware to prevent interrupt generation
> and BEI is disabled when an interrupt is necessary.
>
> When using BEI, consider splitting a single multi-message transfer into
> chunks of 8 messages internally and so interrupts are not expected for
> the first 7 message completions, only the last message triggers
> an interrupt, indicating the completion of 8 messages.
>
> This BEI mechanism enhances overall transfer efficiency.
That sounds good, but I don't like the idea of adding a custom interface
for this. Please use DMA_PREP_INTERRUPT instead: adding this flag should
trigger N interrupts, while its absence should lead to Block events only
--
~Vinod
Following a recent discussion at last Plumbers, John Stultz, Sumit
Semwal, TJ Mercier and I came to an agreement that we should document
what the dma-buf heap names are expected to be, and what attributes
the buffers you'll get from them will have.
Let's create that doc to make sure those names and attributes are
guaranteed going forward.
Reviewed-by: T.J. Mercier <tjmercier(a)google.com>
Signed-off-by: Maxime Ripard <mripard(a)kernel.org>
---
Changes from v3:
- Add TJ RB
- Fix a few typos.
Changes from v2:
- Remove exhaustive list of names for platforms, and just mention the
alternatives.
- Add MAINTAINERS entry
Changes from v1:
- Add the mention that the cma / reserved heap is optional.
---
Documentation/userspace-api/dma-buf-heaps.rst | 25 +++++++++++++++++++
Documentation/userspace-api/index.rst | 1 +
MAINTAINERS | 1 +
3 files changed, 27 insertions(+)
create mode 100644 Documentation/userspace-api/dma-buf-heaps.rst
diff --git a/Documentation/userspace-api/dma-buf-heaps.rst b/Documentation/userspace-api/dma-buf-heaps.rst
new file mode 100644
index 000000000000..535f49047ce6
--- /dev/null
+++ b/Documentation/userspace-api/dma-buf-heaps.rst
@@ -0,0 +1,25 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================
+Allocating dma-buf using heaps
+==============================
+
+Dma-buf Heaps are a way for userspace to allocate dma-buf objects. They are
+typically used to allocate buffers from a specific allocation pool, or to share
+buffers across frameworks.
+
+Heaps
+=====
+
+A heap represents a specific allocator. The Linux kernel currently supports the
+following heaps:
+
+ - The ``system`` heap allocates virtually contiguous, cacheable, buffers.
+
+ - The ``cma`` heap allocates physically contiguous, cacheable,
+ buffers. Only present if a CMA region is present. Such a region is
+ usually created either through the kernel commandline through the
+ `cma` parameter, a memory region Device-Tree node with the
+ `linux,cma-default` property set, or through the `CMA_SIZE_MBYTES` or
+ `CMA_SIZE_PERCENTAGE` Kconfig options. Depending on the platform, it
+ might be called ``reserved``, ``linux,cma``, or ``default-pool``.
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index b1395d94b3fd..9cbe4390c872 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -42,10 +42,11 @@ Devices and I/O
.. toctree::
:maxdepth: 1
accelerators/ocxl
+ dma-buf-heaps
dma-buf-alloc-exchange
gpio/index
iommufd
media/index
dcdbas
diff --git a/MAINTAINERS b/MAINTAINERS
index 8e0736dc2ee0..ef0364e65aef 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6911,10 +6911,11 @@ R: T.J. Mercier <tjmercier(a)google.com>
L: linux-media(a)vger.kernel.org
L: dri-devel(a)lists.freedesktop.org
L: linaro-mm-sig(a)lists.linaro.org (moderated for non-subscribers)
S: Maintained
T: git https://gitlab.freedesktop.org/drm/misc/kernel.git
+F: Documentation/userspace-api/dma-buf-heaps.rst
F: drivers/dma-buf/dma-heap.c
F: drivers/dma-buf/heaps/*
F: include/linux/dma-heap.h
F: include/uapi/linux/dma-heap.h
F: tools/testing/selftests/dmabuf-heaps/
--
2.48.1
Following a recent discussion at last Plumbers, John Stultz, Sumit
Semwal, TJ Mercier and I came to an agreement that we should document
what the dma-buf heap names are expected to be, and what attributes
the buffers you'll get from them will have.
Let's create that doc to make sure those names and attributes are
guaranteed going forward.
Signed-off-by: Maxime Ripard <mripard(a)kernel.org>
---
Changes from v2:
- Remove exhaustive list of names for platforms, and just mention the
alternatives.
- Add MAINTAINERS entry
Changes from v1:
- Add the mention that the cma / reserved heap is optional.
---
Documentation/userspace-api/dma-buf-heaps.rst | 25 +++++++++++++++++++
Documentation/userspace-api/index.rst | 1 +
MAINTAINERS | 1 +
3 files changed, 27 insertions(+)
create mode 100644 Documentation/userspace-api/dma-buf-heaps.rst
diff --git a/Documentation/userspace-api/dma-buf-heaps.rst b/Documentation/userspace-api/dma-buf-heaps.rst
new file mode 100644
index 000000000000..5b92d69646f6
--- /dev/null
+++ b/Documentation/userspace-api/dma-buf-heaps.rst
@@ -0,0 +1,25 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==============================
+Allocating dma-buf using heaps
+==============================
+
+Dma-buf Heaps are a way for userspace to allocate dma-buf objects. They are
+typically used to allocate buffers from a specific allocation pool, or to share
+buffers across frameworks.
+
+Heaps
+=====
+
+A heap represent a specific allocator. The Linux kernel currently supports the
+following heaps:
+
+ - The ``system`` heap allocates virtually contiguous, cacheable, buffers
+
+ - The ``cma`` heap allocates physically contiguous, cacheable,
+ buffers. Only present if a CMA region is present. Such a region is
+ usually created either through the kernel commandline through the
+ `cma` parameter, a memory region Device-Tree node with the
+ `linux,cma-default` property set, or through the `CMA_SIZE_MBYTES` or
+ `CMA_SIZE_PERCENTAGE` Kconfig options. Depending on the platform, it
+ might be called ``reserved``, ``linux,cma``, or ``default-pool``.
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index b1395d94b3fd..9cbe4390c872 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -42,10 +42,11 @@ Devices and I/O
.. toctree::
:maxdepth: 1
accelerators/ocxl
+ dma-buf-heaps
dma-buf-alloc-exchange
gpio/index
iommufd
media/index
dcdbas
diff --git a/MAINTAINERS b/MAINTAINERS
index 8e0736dc2ee0..ef0364e65aef 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6911,10 +6911,11 @@ R: T.J. Mercier <tjmercier(a)google.com>
L: linux-media(a)vger.kernel.org
L: dri-devel(a)lists.freedesktop.org
L: linaro-mm-sig(a)lists.linaro.org (moderated for non-subscribers)
S: Maintained
T: git https://gitlab.freedesktop.org/drm/misc/kernel.git
+F: Documentation/userspace-api/dma-buf-heaps.rst
F: drivers/dma-buf/dma-heap.c
F: drivers/dma-buf/heaps/*
F: include/linux/dma-heap.h
F: include/uapi/linux/dma-heap.h
F: tools/testing/selftests/dmabuf-heaps/
--
2.48.1
On Fri, 14 Feb 2025 at 21:19, Boris Brezillon
<boris.brezillon(a)collabora.com> wrote:
>
> On Fri, 14 Feb 2025 18:37:14 +0530
> Sumit Garg <sumit.garg(a)linaro.org> wrote:
>
> > On Fri, 14 Feb 2025 at 15:37, Jens Wiklander <jens.wiklander(a)linaro.org> wrote:
> > >
> > > Hi,
> > >
> > > On Thu, Feb 13, 2025 at 6:39 PM Daniel Stone <daniel(a)fooishbar.org> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Thu, 13 Feb 2025 at 15:57, Jens Wiklander <jens.wiklander(a)linaro.org> wrote:
> > > > > On Thu, Feb 13, 2025 at 3:05 PM Daniel Stone <daniel(a)fooishbar.org> wrote:
> > > > > > But just because TEE is one good backend implementation, doesn't mean
> > > > > > it should be the userspace ABI. Why should userspace care that TEE has
> > > > > > mediated the allocation instead of it being a predefined range within
> > > > > > DT?
> > > > >
> > > > > The TEE may very well use a predefined range that part is abstracted
> > > > > with the interface.
> > > >
> > > > Of course. But you can also (and this has been shipped on real
> > > > devices) handle this without any per-allocation TEE needs by simply
> > > > allocating from a memory range which is predefined within DT.
> > > >
> > > > From the userspace point of view, why should there be one ABI to
> > > > allocate memory from a predefined range which is delivered by DT to
> > > > the kernel, and one ABI to allocate memory from a predefined range
> > > > which is mediated by TEE?
> > >
> > > We need some way to specify the protection profile (or use case as
> > > I've called it in the ABI) required for the buffer. Whether it's
> > > defined in DT seems irrelevant.
> > >
> > > >
> > > > > > What advantage
> > > > > > does userspace get from having to have a different codepath to get a
> > > > > > different handle to memory? What about x86?
> > > > > >
> > > > > > I think this proposal is looking at it from the wrong direction.
> > > > > > Instead of working upwards from the implementation to userspace, start
> > > > > > with userspace and work downwards. The interesting property to focus
> > > > > > on is allocating memory, not that EL1 is involved behind the scenes.
> > > > >
> > > > > From what I've gathered from earlier discussions, it wasn't much of a
> > > > > problem for userspace to handle this. If the kernel were to provide it
> > > > > via a different ABI, how would it be easier to implement in the
> > > > > kernel? I think we need an example to understand your suggestion.
> > > >
> > > > It is a problem for userspace, because we need to expose acceptable
> > > > parameters for allocation through the entire stack. If you look at the
> > > > dmabuf documentation in the kernel for how buffers should be allocated
> > > > and exchanged, you can see the negotiation flow for modifiers. This
> > > > permeates through KMS, EGL, Vulkan, Wayland, GStreamer, and more.
> > >
> > > What dma-buf properties are you referring to?
> > > dma_heap_ioctl_allocate() accepts a few flags for the resulting file
> > > descriptor and no flags for the heap itself.
> > >
> > > >
> > > > Standardising on heaps allows us to add those in a similar way.
> > >
> > > How would you solve this with heaps? Would you use one heap for each
> > > protection profile (use case), add heap_flags, or do a bit of both?
>
> I would say one heap per-profile.
>
And then it would have a per-vendor multiplication factor, as each
vendor enforces memory restrictions in a platform-specific manner, which
won't scale.
> >
> > Christian gave an historical background here [1] as to why that hasn't
> > worked in the past with DMA heaps given the scalability issues.
> >
> > [1] https://lore.kernel.org/dri-devel/e967e382-6cca-4dee-8333-39892d532f71@gmai…
>
> Hm, I fail to see where Christian dismiss the dma-heaps solution in
> this email. He even says:
>
> > If the memory is not physically attached to any device, but rather just
> memory attached to the CPU or a system wide memory controller then
> expose the memory as DMA-heap with specific requirements (e.g. certain
> sized pages, contiguous, restricted, encrypted, ...).
I am not saying Christian dismissed DMA heaps, but rather that
scalability is an issue. What we are proposing here is a generic
interface via TEE to the firmware/Trusted OS, which can perform all the
platform-specific memory restrictions. This solution will scale across
vendors.
>
> >
> > >
> > > > If we
> > > > have to add different allocation mechanisms, then the complexity
> > > > increases, permeating not only into all the different userspace APIs,
> > > > but also into the drivers which need to support every different
> > > > allocation mechanism even if they have no opinion on it - e.g. Mali
> > > > doesn't care in any way whether the allocation comes from a heap or
> > > > TEE or ACPI or whatever, it cares only that the memory is protected.
> > > >
> > > > Does that help?
> > >
> > > I think you're missing the stage where an unprotected buffer is
> > > received and decrypted into a protected buffer. If you use the TEE for
> > > decryption or to configure the involved devices for the use case, it
> > > makes sense to let the TEE allocate the buffers, too. A TEE doesn't
> > > have to be an OS in the secure world, it can be an abstraction to
> > > support the use case depending on the design. So the restricted buffer
> > > is already allocated before we reach Mali in your example.
> > >
> > > Allocating restricted buffers from the TEE subsystem saves us from
> > > maintaining proxy dma-buf heaps.
>
> Honestly, when I look at dma-heap implementations, they seem
> to be trivial shells around existing (more complex) allocators, and the
> boilerplate [1] to expose a dma-heap is relatively small. The dma-buf
> implementation you already have, so we're talking about a hundred
> lines of code to maintain, which shouldn't be significantly more than
> what you have for the new ioctl(), to be honest.
It would rather be redundant vendor-specific code under DMA heaps
calling into the firmware/Trusted OS to enforce memory restrictions, as
you can see in the Mediatek example [1]. With the TEE subsystem managing
that, this won't be the case, since we will provide a common abstraction
for communication with the underlying firmware/Trusted OS.
[1] https://lore.kernel.org/linux-arm-kernel/20240515112308.10171-1-yong.wu@med…
> And I'll insist on what
> Daniel said, it's a small price to pay to have a standard interface to
> expose to userspace. If dma-heaps are not used for this kind things, I
> honestly wonder what they will be used for...
Let's try not to forcefully find a use-case for DMA heaps when there
is a better alternative available. I am still failing to see why you
don't consider the following a standardised user-space interface:
"When user-space has to work with restricted memory, ask the TEE device
to allocate it"
-Sumit
Hi
Am 21.02.25 um 16:51 schrieb andriy.shevchenko(a)linux.intel.com:
> On Fri, Feb 21, 2025 at 11:36:00AM +0000, Aditya Garg wrote:
>> From: Kerem Karabay <kekrby(a)gmail.com>
>>
>> Add XRGB8888 emulation helper for devices that only support BGR888.
> ...
>
>> + for (x = 0; x < pixels; x++) {
>> + pix = le32_to_cpu(sbuf32[x]);
>> + /* write red-green-blue to output in little endianness */
>> + *dbuf8++ = (pix & 0x00ff0000) >> 16;
>> + *dbuf8++ = (pix & 0x0000ff00) >> 8;
>> + *dbuf8++ = (pix & 0x000000ff) >> 0;
> put_unaligned_be24()
I'm all for sharing helper code, but maybe not here.
- DRM pixel formats are always little endian.
- CPU encoding is LE or BE.
- Pixel-component order can be anything: RGB/BGR/etc.
So the code has a comment to explain what happens here. Adding a call
with the _be24 suffix into the mix would only make it more confusing.
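For illustration, the conversion under discussion boils down to the following standalone sketch (the function name is made up for this example; pixel values are assumed to already be in CPU byte order, i.e. after le32_to_cpu()):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Convert XRGB8888 pixels to BGR888 bytes. DRM pixel formats are
 * little endian, so for BGR888 the byte order in memory is R, G, B.
 * 'src' holds pixel values already in CPU byte order.
 */
static void xrgb8888_to_bgr888(uint8_t *dst, const uint32_t *src,
			       size_t pixels)
{
	for (size_t x = 0; x < pixels; x++) {
		uint32_t pix = src[x];

		/* write red-green-blue to output in little endianness */
		*dst++ = (pix & 0x00ff0000) >> 16; /* R */
		*dst++ = (pix & 0x0000ff00) >> 8;  /* G */
		*dst++ = (pix & 0x000000ff) >> 0;  /* B */
	}
}
```

The X (alpha) byte of each source pixel is simply dropped, which is why the output is 3 bytes per pixel.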
>
>> + }
> ...
>
>> + static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = {
>> + 3,
>> + };
> One line?
>
> static const u8 dst_pixsize[DRM_FORMAT_MAX_PLANES] = { 3 };
I'd be OK if there were a strong preference in the kernel to use this
style. Most of DRM doesn't, AFAIK.
Best regards
Thomas
>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
Hi
Am 22.02.25 um 10:07 schrieb Aditya Garg:
>> What padding, please? Why do the TCP UAPI headers not have these attributes?
>> Think about it, and think about what __packed actually does and how badly it
>> affects code generation. Otherwise it looks like a cargo cult.
>>
>>> I tried removing __packed btw and driver no longer works.
>> So, you need to find a justification why. But definitely not due to padding in
>> many of them. They can go without __packed as they are naturally aligned.
> Alright, I did some debugging, basically printk'ing sizeof() for each struct. I did it for both packed and unpacked versions, with the following results:
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_request_header is 16
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_request_header_unpacked is 16
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_response_header is 20
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_response_header_unpacked is 20
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_simple_request is 32
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_simple_request_unpacked is 32
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_information is 65
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_msg_information_unpacked is 68
In the unpacked version, there is a 3-byte gap after the
'bits_per_pixel' to align the next field. Using __packed removes those
gaps at the expense of runtime overhead.
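The gap being described can be demonstrated with a small standalone program. The field names here are illustrative, not the actual appletbdrm layout; a u8 followed by a u32 gets 3 bytes of padding on common ABIs:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * The compiler inserts a 3-byte gap after 'bits_per_pixel' so that
 * 'width' lands on a 4-byte boundary.
 */
struct info_unpacked {
	uint8_t  bits_per_pixel;
	uint32_t width;
};

/*
 * __packed (i.e. __attribute__((packed))) removes the gap, at the
 * cost of potentially unaligned field accesses at runtime.
 */
struct info_packed {
	uint8_t  bits_per_pixel;
	uint32_t width;
} __attribute__((packed));
```

Here sizeof(struct info_unpacked) is 8 while sizeof(struct info_packed) is 5, mirroring the 68-vs-65 difference observed for appletbdrm_msg_information above.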
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_frame is 12
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_frame_unpacked is 12
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_fb_request_footer is 80
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_fb_request_footer_unpacked is 80
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_fb_request is 48
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_fb_request_unpacked is 48
>
> Feb 22 13:02:03 MacBook kernel: size of struct appletbdrm_fb_request_response is 40
> Feb 22 13:02:04 MacBook kernel: size of struct appletbdrm_fb_request_response_unpacked is 40
>
> So, the difference in sizeof between packed and unpacked is only in appletbdrm_msg_information. I kept that one packed, removed __packed from the others, and the Touch Bar still works.
>
> So maybe keep just this one packed?
The fields in the TCP header are aligned by design. Unfortunately, this
hardware's protocol is not, and there's no way of fixing that now. Just
keep all of the structs packed; at least it's then clear what happens.
And if the hardware requires this layout, you can't do much anyway.
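When fields of a packed wire format do end up misaligned, the portable way to read them is a byte-wise load rather than a pointer cast. A minimal sketch (the helper name mirrors the kernel's get_unaligned_le32(), but this is a userspace stand-in, not the kernel implementation):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Load a little-endian 32-bit value from an arbitrarily aligned byte
 * buffer. Assembling the value from bytes sidesteps both alignment
 * traps and host endianness.
 */
static uint32_t get_unaligned_le32(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 |
	       (uint32_t)p[3] << 24;
}
```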
Best regards
Thomas
>>
>>
>> ...
>>
>>>>> + if (response->msg == APPLETBDRM_MSG_SIGNAL_READINESS) {
>>>>> + if (!readiness_signal_received) {
>>>>> + readiness_signal_received = true;
>>>>> + goto retry;
>>>>> + }
>>>>> +
>>>>> + drm_err(drm, "Encountered unexpected readiness signal\n");
>>>>> + return -EIO;
>>>>> + }
>>>>> +
>>>>> + if (actual_size != size) {
>>>>> + drm_err(drm, "Actual size (%d) doesn't match expected size (%lu)\n",
>>>>> + actual_size, size);
>>>>> + return -EIO;
>>>>> + }
>>>>> +
>>>>> + if (response->msg != expected_response) {
>>>>> + drm_err(drm, "Unexpected response from device (expected %p4ch found %p4ch)\n",
>>>>> + &expected_response, &response->msg);
>>>>> + return -EIO;
>>>> For three different cases the same error code, can it be adjusted more to the
>>>> situation?
>>> All these are I/O errors; do you have any suggestions?
>> Your email client mangled the code so badly that it's hard to read. But I would
>> suggest using -EINTR in the first case, and -EBADMSG. You may also consider
>> -EPROTO.
> Thanks
>>>>> + }
>> ...
>>
>>>>> + if (ret)
>>>>> + return ret;
>>>>> + else if (!new_plane_state->visible)
>>>> Why 'else'? It's redundant.
>>> I’ve just followed what other drm drivers are doing here:
>>>
>>> https://elixir.bootlin.com/linux/v6.13.3/source/drivers/gpu/drm/tiny/bochs.…
>>> https://elixir.bootlin.com/linux/v6.13.3/source/drivers/gpu/drm/tiny/cirrus…
>>>
>>> And plenty more
>> A bad example is still a bad example. 'else' is simply redundant in this
>> case and adds noise to the code.
>>
>>> I won’t mind removing else. You want that?
>> Sure.
>>
>> ...
>>
>>>>> + request_size = ALIGN(sizeof(struct appletbdrm_fb_request) +
>>>>> + frames_size +
>>>>> + sizeof(struct appletbdrm_fb_request_footer), 16);
>>>> Missing header for ALIGN().
>>>>
>>>> But have you checked overflow.h for the possibility of using some helper macros
>>>> from there? This is what should be usually done for k*alloc() in the kernel.
>>> I don’t really think we need a macro here.
>> Hmm... is frames_size known to be in a guaranteed range to make sure no
>> potential overflow happens?
> I don’t really find any cause of potential overflow.
>
>
>>>>> + appletbdrm_state->request = kzalloc(request_size, GFP_KERNEL);
>>>>> +
>>>>> + if (!appletbdrm_state->request)
>>>>> + return -ENOMEM;
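For reference, the overflow.h helpers Andy refers to (size_add() and friends) saturate at SIZE_MAX so that the subsequent allocation fails cleanly. A userspace sketch of the same pattern, built on the compiler builtin those kernel helpers use internally (the function name here is illustrative):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Saturating add, mirroring size_add() from the kernel's overflow.h:
 * on overflow it returns SIZE_MAX, so a following kzalloc()-style
 * allocation fails outright instead of silently returning a buffer
 * smaller than the (wrapped) size the caller thinks it got.
 */
static size_t size_add_sat(size_t a, size_t b)
{
	size_t r;

	if (__builtin_add_overflow(a, b, &r))
		return SIZE_MAX;
	return r;
}
```

So even if frames_size can't overflow today, composing the request size this way makes the code robust against future changes for free.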
>> ...
>>
>>>>> + request->msg_id = timestamp & 0xff;
>>>> Why ' & 0xff'?
>>> https://github.com/imbushuo/DFRDisplayKm/blob/master/src/DFRDisplayKm/DfrDi…
>> This is not an answer.
>> Why do you need this here? Isn't the type of msg_id enough?
> Hmm, I double checked this. msg_id is u8 in the Linux port, so it would never exceed 0xff anyway. I'll remove this.
> It's different in the Windows driver.
>> ...
>>
>>>>> + adev->mode = (struct drm_display_mode) {
>>>> Why do you need a compound literal here? Perhaps you want to have that to be
>>>> done directly in DRM_MODE_INIT()?
>>> I really don’t find this as an issue. You want me to declare another structure, basically this?:
>> Nope, I'm asking if the DRM_MODE_INIT() is done in a way that it only can be
>> used for the static data. Seems like the case. Have you tried to convert
>> DRM_MODE_INIT() to be always a compound literal? Does it break things?
> Seems to be breaking things.
>>> struct drm_display_mode mode = {
>>> DRM_MODE_INIT(60, adev->height, adev->width,
>>> DRM_MODE_RES_MM(adev->height, 218),
>>> DRM_MODE_RES_MM(adev->width, 218))
>>> };
>>> adev->mode = mode;
>>>
>>>>> + DRM_MODE_INIT(60, adev->height, adev->width,
>>>>> + DRM_MODE_RES_MM(adev->height, 218),
>>>>> + DRM_MODE_RES_MM(adev->width, 218))
>>>>> + };
>> --
>> With Best Regards,
>> Andy Shevchenko
>>
>>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
On Tue, Feb 11, 2025 at 03:32:23PM +0100, Boris Brezillon wrote:
> On Tue, 11 Feb 2025 14:46:56 +0100
> Maxime Ripard <mripard(a)kernel.org> wrote:
>
> > Hi Boris,
> >
> > On Fri, Feb 07, 2025 at 04:02:53PM +0100, Boris Brezillon wrote:
> > > Sorry for joining the party late, a couple of comments to back Akash
> > > and Nicolas' concerns.
> > >
> > > On Wed, 05 Feb 2025 13:14:14 -0500
> > > Nicolas Dufresne <nicolas(a)ndufresne.ca> wrote:
> > >
> > > > Le mercredi 05 février 2025 à 15:52 +0100, Maxime Ripard a écrit :
> > > > > On Mon, Feb 03, 2025 at 04:43:23PM +0000, Florent Tomasin wrote:
> > > > > > Hi Maxime, Nicolas
> > > > > >
> > > > > > On 30/01/2025 17:47, Nicolas Dufresne wrote:
> > > > > > > Le jeudi 30 janvier 2025 à 17:38 +0100, Maxime Ripard a écrit :
> > > > > > > > Hi Nicolas,
> > > > > > > >
> > > > > > > > On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote:
> > > > > > > > > Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit :
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > I started to review it, but it's probably best to discuss it here.
> > > > > > > > > >
> > > > > > > > > > On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:
> > > > > > > > > > > Hi,
> > > > > > > > > > >
> > > > > > > > > > > This is a patch series covering the support for protected mode execution in
> > > > > > > > > > > Mali Panthor CSF kernel driver.
> > > > > > > > > > >
> > > > > > > > > > > The Mali CSF GPUs come with the support for protected mode execution at the
> > > > > > > > > > > HW level. This feature requires two main changes in the kernel driver:
> > > > > > > > > > >
> > > > > > > > > > > 1) Configure the GPU with a protected buffer. The system must provide a DMA
> > > > > > > > > > > heap from which the driver can allocate a protected buffer.
> > > > > > > > > > > It can be a carved-out memory or dynamically allocated protected memory region.
> > > > > > > > > > > Some system includes a trusted FW which is in charge of the protected memory.
> > > > > > > > > > > Since this problem is integration specific, the Mali Panthor CSF kernel
> > > > > > > > > > > driver must import the protected memory from a device specific exporter.
> > > > > > > > > >
> > > > > > > > > > Why do you need a heap for it in the first place? My understanding of
> > > > > > > > > > your series is that you have a carved out memory region somewhere, and
> > > > > > > > > > you want to allocate from that carved out memory region your buffers.
> > > > > > > > > >
> > > > > > > > > > How is that any different from using a reserved-memory region, adding
> > > > > > > > > > the reserved-memory property to the GPU device and doing all your
> > > > > > > > > > allocation through the usual dma_alloc_* API?
> > > > > > > > >
> > > > > > > > > How do you then multiplex this region so it can be shared between
> > > > > > > > > GPU/Camera/Display/Codec drivers and also userspace ?
> > > > > > > >
> > > > > > > > You could point all the devices to the same reserved memory region, and
> > > > > > > > they would all allocate from there, including for their userspace-facing
> > > > > > > > allocations.
> > > > > > >
> > > > > > > I get that using a memory region is somewhat more of a HW description, and
> > > > > > > aligned with what a DT is supposed to describe. One of the challenges is that
> > > > > > > Mediatek's heap proposal ends up calling into their TEE, meaning that knowing
> > > > > > > the region is not that useful. You actually need the TEE APP guid and its IPC
> > > > > > > protocol. If we can tell drivers to use a heap instead, we can abstract that
> > > > > > > SoC-specific complexity. I believe each allocated address has to be mapped to
> > > > > > > a zone, and that can only be done in the secure application. I can imagine
> > > > > > > similar needs when the protection is done using some sort of a VM / hypervisor.
> > > > > > >
> > > > > > > Nicolas
> > > > > > >
> > > > > >
> > > > > > The idea in this design is to abstract the heap management from the
> > > > > > Panthor kernel driver (which consumes a DMA buffer from it).
> > > > > >
> > > > > > In a system, an integrator would have implemented a secure heap driver,
> > > > > > and could be based on TEE or a carved-out memory with restricted access,
> > > > > > or else. This heap driver would be responsible of implementing the
> > > > > > logic to: allocate, free, refcount, etc.
> > > > > >
> > > > > > The heap would be retrieved by the Panthor kernel driver in order to
> > > > > > allocate protected memory to load the FW and allow the GPU to enter/exit
> > > > > > protected mode. This memory would not belong to a user space process.
> > > > > > The driver allocates it at the time of loading the FW and initialization
> > > > > > of the GPU HW. This is a device globally owned protected memory.
> > > > >
> > > > > The thing is, it's really not clear why you absolutely need to have the
> > > > > Panthor driver involved there. It won't be transparent to userspace,
> > > > > since you'd need an extra flag at allocation time, and the buffers
> > > > > behave differently. If userspace has to be aware of it, what's the
> > > > > advantage to your approach compared to just exposing a heap for those
> > > > > secure buffers, and letting userspace allocate its buffers from there?
> > > >
> > > > Unless I'm mistaken, the Panthor driver loads its own firmware. Since loading
> > > > the firmware requires placing the data in a protected memory region, and that
> > > > this aspect has no exposure to userspace, how can Panthor not be implicated ?
> > >
> > > Right, the very reason we need protected memory early is because some
> > > FW sections need to be allocated from the protected pool, otherwise the
> > > TEE will fault as soon at the FW enters the so-called 'protected mode'.
> >
> > How does that work if you don't have some way to allocate the protected
> > memory? You can still submit jobs to the GPU, but you can't submit /
> > execute "protected jobs"?
>
> Exactly.
>
> >
> > > Now, it's not impossible to work around this limitation. For instance,
> > > we could load the FW without this protected section by default (what we
> > > do right now), and then provide a DRM_PANTHOR_ENABLE_FW_PROT_MODE
> > > ioctl that would take a GEM object imported from a dmabuf allocated
> > > from the protected dma-heap by userspace. We can then reset the FW and
> > > allow it to operate in protected mode after that point.
> >
> > Urgh, I'd rather avoid that dance if possible :)
>
> Me too.
>
> >
> > > This approach has two downsides though:
> > >
> > > 1. We have no way of checking that the memory we're passed is actually
> > > suitable for FW execution in a protected context. If we're passed
> > > random memory, this will likely hang the platform as soon as we enter
> > > protected mode.
> >
> > It's a current limitation of dma-buf in general, and you'd have the same
> > issue right now if someone imports a buffer, or misconfigure the heap
> > for a !protected heap.
> >
> > I'd really like to have some way to store some metadata in dma_buf, if
> > only to tell that the buffer is protected.
>
> The dma_buf has a pointer to its ops, so it should be relatively easy
> to add an is_dma_buf_coming_from_this_heap() helper. Of course this
> implies linking the consumer driver to the heap it's supposed to take
> protected buffers from, which is basically the thing being discussed
> here :-).
I'm not sure looking at the ops would be enough. You can compare
whether the buffer you allocated comes from the heap you got from the
DT, but if that heap doesn't allocate protected buffers, you're screwed
and you have no way to tell.
It also falls apart if we have a heap driver with multiple instances,
which is pretty likely if we ever merge the carveout heap driver.
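To make that limitation concrete, here is a plain-C sketch of the ops-pointer check (the struct names are illustrative stand-ins for struct dma_buf and struct dma_buf_ops, not real kernel API):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for struct dma_buf_ops / struct dma_buf. */
struct buf_ops { int dummy; };
struct buf { const struct buf_ops *ops; };

/* One ops table is shared by every instance of a given exporter. */
static const struct buf_ops protected_heap_ops = { 0 };
static const struct buf_ops other_exporter_ops = { 0 };

/*
 * Provenance check by ops pointer: this identifies the exporting
 * *driver*, but not the *instance*. Two carveout heaps bound to the
 * same driver share one ops table, so a protected and an unprotected
 * instance are indistinguishable here.
 */
static bool buf_from_protected_heap(const struct buf *b)
{
	return b->ops == &protected_heap_ops;
}
```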
> >
> > I suspect you'd also need that if you do things like do protected video
> > playback through a codec, get a protected frame, and want to import that
> > into the GPU. Depending on how you allocate it, either the codec or the
> > GPU or both will want to make sure it's protected.
>
> If it's all allocated from a central "protected" heap (even if that
> goes through the driver calling the dma_heap_alloc_buffer()), it
> shouldn't be an issue.
Right, assuming we have a way to identify the heap the buffer was
allocated from somehow. This kind of assumes that you only ever get one
source of protected memory, and you'd never allocate a protected buffer
from a different one in the codec driver for example.
> > > 2. If the driver has already booted the FW and exposed a DRI node, we might
> > > have GPU workloads running, and doing a FW reset might incur a slight
> > > delay in GPU job execution.
> > >
> > > I think #1 is a more general issue that applies to suspend buffers
> > > allocated for GPU contexts too. If we expose ioctls where we take
> > > protected memory buffers that can possibly lead to crashes if they are
> > > not real protected memory regions, and we have no way to ensure the
> > > memory is protected, we probably want to restrict these ioctls/modes to
> > > some high-privilege CAP_SYS_.
> > >
> > > For #2, that's probably something we can live with, since it's a
> > > one-shot thing. If it becomes an issue, we can even make sure we enable
> > > the FW protected-mode before the GPU starts being used for real.
> > >
> > > This being said, I think the problem applies outside Panthor, and it
> > > might be that the video codec can't reset the FW/HW block to switch to
> > > protected mode as easily as Panthor.
> > >
> > > Note that there's also downsides to the reserved-memory node approach,
> > > where some bootloader stage would ask the secure FW to reserve a
> > > portion of mem and pass this through the DT. This sort of thing tends to
> > > be an integration mess, where you need all the pieces of the stack (TEE,
> > > u-boot, MTK dma-heap driver, gbm, ...) to be at a certain version to
> > > work properly. If we go the ioctl() way, we restrict the scope to the
> > > TEE, gbm/mesa and the protected-dma-heap driver, which is still a lot,
> > > but we've ripped the bootloader out of the equation at least.
> >
> > Yeah. I also think there's two discussions in parallel here:
> >
> > 1) Being able to allocate protected buffers from the driver
> > 2) Exposing an interface to allocate those to userspace
> >
> > I'm not really convinced we need 2, but 1 is obviously needed from what
> > you're saying.
>
> I suspect we need #2 for GBM, still. But that's what dma-heaps are for,
> so I don't think that's a problem.
Yeah, that was my point too, we have the heaps for that already.
Maxime
Hi,
[Now with a Gstreamer demo, see below.]
This patch set allocates the restricted DMA-bufs via the TEE subsystem.
The TEE subsystem handles the DMA-buf allocations since it is the TEE
(OP-TEE, AMD-TEE, TS-TEE, or perhaps a future QCOMTEE) which sets up the
restrictions for the memory used for the DMA-bufs.
I've added a new IOCTL, TEE_IOC_RSTMEM_ALLOC, to allocate the restricted
DMA-bufs. This IOCTL reaches the backend TEE driver, allowing it to choose
how to allocate the restricted physical memory.
In addition to size and flags parameters, TEE_IOC_RSTMEM_ALLOC also takes
a use-case parameter. This is used by the backend TEE driver to decide on
the allocation policy and which devices should be able to access the memory.
Three use-cases (Secure Video Playback, Trusted UI, and Secure Video
Recording) have been identified so far to serve as examples of what can
be expected. More use-cases can be added in the userspace ABI, but it's
up to the backend TEE drivers to provide the implementation.
Each use-case has its own restricted memory pool, since different
use-cases require isolation from different parts of the system. A
restricted memory pool can be based on a static carveout instantiated
while probing the TEE backend driver, or dynamically allocated from CMA
and made restricted as needed by the TEE.
This can be tested on a RockPi 4B+ with the following steps:
repo init -u https://github.com/jenswi-linaro/manifest.git -m rockpi4.xml \
-b prototype/sdp-v5
repo sync -j8
cd build
make toolchains -j$(nproc)
make all -j$(nproc)
# Copy ../out/rockpi4.img to an SD card and boot the RockPi from that
# Connect a monitor to the RockPi
# login and at the prompt:
gst-launch-1.0 videotestsrc ! \
aesenc key=1f9423681beb9a79215820f6bda73d0f \
iv=e9aa8e834d8d70b7e0d254ff670dd718 serialize-iv=true ! \
aesdec key=1f9423681beb9a79215820f6bda73d0f ! \
kmssink
The aesdec module has been hacked to use an OP-TEE TA to decrypt the stream
into restricted DMA-bufs which are consumed by the kmssink.
The primitive QEMU tests from previous patch set can be tested on RockPi
in the same way with:
xtest --sdp-basic
The primitive test are tested on QEMU with the following steps:
repo init -u https://github.com/jenswi-linaro/manifest.git -m qemu_v8.xml \
-b prototype/sdp-v5
repo sync -j8
cd build
make toolchains -j$(nproc)
make SPMC_AT_EL=1 all -j$(nproc)
make SPMC_AT_EL=1 run-only
# login and at the prompt:
xtest --sdp-basic
The SPMC_AT_EL=1 parameter configures the build with FF-A and an SPMC at
S-EL1 inside OP-TEE. The parameter can be changed into SPMC_AT_EL=n to test
without FF-A using the original SMC ABI instead. Please remember to do
rm -rf ../trusted-firmware-a/build/qemu
for TF-A to be rebuilt properly using the new configuration.
https://optee.readthedocs.io/en/latest/building/prerequisites.html
lists the dependencies needed to build the above.
The tests are pretty basic, mostly checking that a Trusted Application in
the secure world can access and manipulate the memory. There are also some
negative tests for out-of-bounds buffers, etc.
Thanks,
Jens
Changes since V4:
* Adding the patch "tee: add TEE_IOC_RSTMEM_FD_INFO" needed by the
GStreamer demo
* Removing the dummy CPU access and mmap functions from the dma_buf_ops
* Fixing a compile error in "optee: FF-A: dynamic restricted memory allocation"
reported by kernel test robot <lkp(a)intel.com>
Changes since V3:
* Make the use_case and flags fields in struct tee_shm u32's instead of
  u16's
* Add more description for TEE_IOC_RSTMEM_ALLOC in the header file
* Import namespace DMA_BUF in module tee, reported by lkp(a)intel.com
* Added a note in the commit message for "optee: account for direction
while converting parameters" why it's needed
* Factor out dynamic restricted memory allocation from
"optee: support restricted memory allocation" into two new commits
"optee: FF-A: dynamic restricted memory allocation" and
"optee: smc abi: dynamic restricted memory allocation"
* Guard CMA usage with #ifdef CONFIG_CMA, effectively disabling dynamic
restricted memory allocate if CMA isn't configured
Changes since the V2 RFC:
* Based on v6.12
* Replaced the flags for SVP and Trusted UI memory with a u32 field with a
  unique id for each use case
* Added dynamic allocation of restricted memory pools
* Added OP-TEE ABI both with and without FF-A for dynamic restricted memory
* Added support for FF-A with FFA_LEND
Changes since the V1 RFC:
* Based on v6.11
* Complete rewrite, replacing the restricted heap with TEE_IOC_RSTMEM_ALLOC
Changes since Olivier's post [2]:
* Based on Yong Wu's post [1] where much of dma-buf handling is done in
the generic restricted heap
* Simplifications and cleanup
* New commit message for "dma-buf: heaps: add Linaro restricted dmabuf heap
support"
* Replaced the word "secure" with "restricted" where applicable
Jens Wiklander (7):
tee: add restricted memory allocation
tee: add TEE_IOC_RSTMEM_FD_INFO
optee: account for direction while converting parameters
optee: sync secure world ABI headers
optee: support restricted memory allocation
optee: FF-A: dynamic restricted memory allocation
optee: smc abi: dynamic restricted memory allocation
drivers/tee/Makefile | 1 +
drivers/tee/optee/Makefile | 1 +
drivers/tee/optee/call.c | 10 +-
drivers/tee/optee/core.c | 1 +
drivers/tee/optee/ffa_abi.c | 179 +++++++++++++-
drivers/tee/optee/optee_ffa.h | 27 ++-
drivers/tee/optee/optee_msg.h | 65 ++++-
drivers/tee/optee/optee_private.h | 75 ++++--
drivers/tee/optee/optee_smc.h | 71 +++++-
drivers/tee/optee/rpc.c | 31 ++-
drivers/tee/optee/rstmem.c | 389 ++++++++++++++++++++++++++++++
drivers/tee/optee/smc_abi.c | 215 +++++++++++++++--
drivers/tee/tee_core.c | 69 +++++-
drivers/tee/tee_private.h | 4 +
drivers/tee/tee_rstmem.c | 188 +++++++++++++++
drivers/tee/tee_shm.c | 2 +
drivers/tee/tee_shm_pool.c | 69 +++++-
include/linux/tee_core.h | 15 ++
include/linux/tee_drv.h | 2 +
include/uapi/linux/tee.h | 71 +++++-
20 files changed, 1409 insertions(+), 76 deletions(-)
create mode 100644 drivers/tee/optee/rstmem.c
create mode 100644 drivers/tee/tee_rstmem.c
base-commit: a64dcfb451e254085a7daee5fe51bf22959d52d3
--
2.43.0