Linaro-mm-sig March 2025

linaro-mm-sig@lists.linaro.org

16 participants
38 discussions

Re: [PATCH] udmabuf: fix a buf size overflow issue during udmabuf creation

by Christian König

Am 21.03.25 um 17:41 schrieb Xiaogang.Chen: > From: Xiaogang Chen <xiaogang.chen(a)amd.com> > > by casting size_limit_mb to u64 when calculate pglimit. > > Signed-off-by: Xiaogang Chen<Xiaogang.Chen(a)amd.com> Reviewed-by: Christian König <christian.koenig(a)amd.com> If nobody objects I'm going to push that to drm-misc-fixes. Regards, Christian. > --- > drivers/dma-buf/udmabuf.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/dma-buf/udmabuf.c b/drivers/dma-buf/udmabuf.c > index 8ce1f074c2d3..e99e3a65a470 100644 > --- a/drivers/dma-buf/udmabuf.c > +++ b/drivers/dma-buf/udmabuf.c > @@ -398,7 +398,7 @@ static long udmabuf_create(struct miscdevice *device, > if (!ubuf) > return -ENOMEM; > > - pglimit = (size_limit_mb * 1024 * 1024) >> PAGE_SHIFT; > + pglimit = ((u64)size_limit_mb * 1024 * 1024) >> PAGE_SHIFT; > for (i = 0; i < head->count; i++) { > pgoff_t subpgcnt; >

3 months, 2 weeks

Re: [PATCH v2 7/7] accel/rocket: Add IOCTLs for synchronizing memory accesses

by Jeffrey Hugo

On 2/25/2025 12:55 AM, Tomeu Vizoso wrote: > +int rocket_ioctl_fini_bo(struct drm_device *dev, void *data, struct drm_file *file) > +{ > + struct drm_rocket_fini_bo *args = data; > + struct drm_gem_object *gem_obj; > + struct rocket_gem_object *rkt_obj; > + struct drm_gem_shmem_object *shmem_obj; > + struct rocket_device *rdev = to_rocket_device(dev); > + > + gem_obj = drm_gem_object_lookup(file, args->handle); > + if (!gem_obj) > + return -ENOENT; > + > + rkt_obj = to_rocket_bo(gem_obj); > + shmem_obj = &rkt_obj->base; > + > + WARN_ON(rkt_obj->last_cpu_prep_op == 0); > + > + for (unsigned int core = 1; core < rdev->num_cores; core++) { > + dma_sync_sgtable_for_device(rdev->cores[core].dev, shmem_obj->sgt, > + rocket_op_to_dma_dir(rkt_obj->last_cpu_prep_op)); > + } > + > + rkt_obj->last_cpu_prep_op = 0; > + > + drm_gem_object_put(gem_obj); > + > + return 0; > +} flags must be 0, and you must check that here. You do not appear to be doing that. Otherwise, userspace may put a value in flags, which is ignored now, but later when you define flags for a purpose, existing userspace will be broken - a uapi violation.

3 months, 3 weeks

Re: [PATCH v2 5/7] accel/rocket: Add IOCTL for BO creation

by Jeffrey Hugo

On 2/25/2025 12:55 AM, Tomeu Vizoso wrote: > +/** > + * rocket_gem_create_object - Implementation of driver->gem_create_object. > + * @dev: DRM device > + * @size: Size in bytes of the memory the object will reference > + * > + * This lets the GEM helpers allocate object structs for us, and keep > + * our BO stats correct. > + */ I would expect that this would throw a warning when making the kernel documentation since you are not describing the return value.

3 months, 3 weeks

Re: [PATCH v6 03/10] optee: account for direction while converting parameters

by Jens Wiklander

Hi Sumit, On Thu, Mar 20, 2025 at 10:25 AM Sumit Garg <sumit.garg(a)kernel.org> wrote: > > Hi Jens, > > On Mon, Mar 17, 2025 at 08:42:01AM +0100, Jens Wiklander wrote: > > Hi Sumit, > > > > On Thu, Mar 13, 2025 at 11:41 AM Sumit Garg <sumit.garg(a)kernel.org> wrote: > > > > > > Hi Jens, > > > > > > On Wed, Mar 05, 2025 at 02:04:09PM +0100, Jens Wiklander wrote: > > > > The OP-TEE backend driver has two internal function pointers to convert > > > > between the subsystem type struct tee_param and the OP-TEE type struct > > > > optee_msg_param. > > > > > > > > The conversion is done from one of the types to the other, which is then > > > > involved in some operation and finally converted back to the original > > > > type. When converting to prepare the parameters for the operation, all > > > > fields must be taken into account, but then converting back, it's enough > > > > to update only out-values and out-sizes. So, an update_out parameter is > > > > added to the conversion functions to tell if all or only some fields > > > > must be copied. > > > > > > > > This is needed in a later patch where it might get confusing when > > > > converting back in from_msg_param() callback since an allocated > > > > restricted SHM can be using the sec_world_id of the used restricted > > > > memory pool and that doesn't translate back well. > > > > > > > > Signed-off-by: Jens Wiklander <jens.wiklander(a)linaro.org> > > > > --- > > > > drivers/tee/optee/call.c | 10 ++-- > > > > drivers/tee/optee/ffa_abi.c | 43 +++++++++++++---- > > > > drivers/tee/optee/optee_private.h | 42 +++++++++++------ > > > > drivers/tee/optee/rpc.c | 31 +++++++++---- > > > > drivers/tee/optee/smc_abi.c | 76 +++++++++++++++++++++++-------- > > > > 5 files changed, 144 insertions(+), 58 deletions(-) > > > > > > > > diff --git a/drivers/tee/optee/call.c b/drivers/tee/optee/call.c > > > > index 16eb953e14bb..f1533b894726 100644 > > > > --- a/drivers/tee/optee/call.c > > > > +++ b/drivers/tee/optee/call.c > > > > @@ -400,7 +400,8 @@ int optee_open_session(struct tee_context *ctx, > > > > export_uuid(msg_arg->params[1].u.octets, &client_uuid); > > > > > > > > rc = optee->ops->to_msg_param(optee, msg_arg->params + 2, > > > > - arg->num_params, param); > > > > + arg->num_params, param, > > > > + false /*!update_out*/); > > > > if (rc) > > > > goto out; > > > > > > > > @@ -427,7 +428,8 @@ int optee_open_session(struct tee_context *ctx, > > > > } > > > > > > > > if (optee->ops->from_msg_param(optee, param, arg->num_params, > > > > - msg_arg->params + 2)) { > > > > + msg_arg->params + 2, > > > > + true /*update_out*/)) { > > > > arg->ret = TEEC_ERROR_COMMUNICATION; > > > > arg->ret_origin = TEEC_ORIGIN_COMMS; > > > > /* Close session again to avoid leakage */ > > > > @@ -541,7 +543,7 @@ int optee_invoke_func(struct tee_context *ctx, struct tee_ioctl_invoke_arg *arg, > > > > msg_arg->cancel_id = arg->cancel_id; > > > > > > > > rc = optee->ops->to_msg_param(optee, msg_arg->params, arg->num_params, > > > > - param); > > > > + param, false /*!update_out*/); > > > > if (rc) > > > > goto out; > > > > > > > > @@ -551,7 +553,7 @@ int optee_invoke_func(struct tee_context *ctx, struct tee_ioctl_invoke_arg *arg, > > > > } > > > > > > > > if (optee->ops->from_msg_param(optee, param, arg->num_params, > > > > - msg_arg->params)) { > > > > + msg_arg->params, true /*update_out*/)) { > > > > msg_arg->ret = TEEC_ERROR_COMMUNICATION; > > > > msg_arg->ret_origin = TEEC_ORIGIN_COMMS; > > > > } > > > > diff --git a/drivers/tee/optee/ffa_abi.c b/drivers/tee/optee/ffa_abi.c > > > > index 4ca1d5161b82..e4b08cd195f3 100644 > > > > --- a/drivers/tee/optee/ffa_abi.c > > > > +++ b/drivers/tee/optee/ffa_abi.c > > > > @@ -122,15 +122,21 @@ static int optee_shm_rem_ffa_handle(struct optee *optee, u64 global_id) > > > > */ > > > > > > > > static void from_msg_param_ffa_mem(struct optee *optee, struct tee_param *p, > > > > - u32 attr, const struct optee_msg_param *mp) > > > > + u32 attr, const struct optee_msg_param *mp, > > > > + bool update_out) > > > > { > > > > struct tee_shm *shm = NULL; > > > > u64 offs_high = 0; > > > > u64 offs_low = 0; > > > > > > > > + if (update_out) { > > > > + if (attr == OPTEE_MSG_ATTR_TYPE_FMEM_INPUT) > > > > + return; > > > > + goto out; > > > > + } > > > > + > > > > p->attr = TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_INPUT + > > > > attr - OPTEE_MSG_ATTR_TYPE_FMEM_INPUT; > > > > - p->u.memref.size = mp->u.fmem.size; > > > > > > > > if (mp->u.fmem.global_id != OPTEE_MSG_FMEM_INVALID_GLOBAL_ID) > > > > shm = optee_shm_from_ffa_handle(optee, mp->u.fmem.global_id); > > > > @@ -141,6 +147,8 @@ static void from_msg_param_ffa_mem(struct optee *optee, struct tee_param *p, > > > > offs_high = mp->u.fmem.offs_high; > > > > } > > > > p->u.memref.shm_offs = offs_low | offs_high << 32; > > > > +out: > > > > + p->u.memref.size = mp->u.fmem.size; > > > > } > > > > > > > > /** > > > > @@ -150,12 +158,14 @@ static void from_msg_param_ffa_mem(struct optee *optee, struct tee_param *p, > > > > * @params: subsystem internal parameter representation > > > > * @num_params: number of elements in the parameter arrays > > > > * @msg_params: OPTEE_MSG parameters > > > > + * @update_out: update parameter for output only > > > > * > > > > * Returns 0 on success or <0 on failure > > > > */ > > > > static int optee_ffa_from_msg_param(struct optee *optee, > > > > struct tee_param *params, size_t num_params, > > > > - const struct optee_msg_param *msg_params) > > > > + const struct optee_msg_param *msg_params, > > > > + bool update_out) > > > > { > > > > size_t n; > > > > > > > > @@ -166,18 +176,20 @@ static int optee_ffa_from_msg_param(struct optee *optee, > > > > > > > > switch (attr) { > > > > case OPTEE_MSG_ATTR_TYPE_NONE: > > > > + if (update_out) > > > > + break; > > > > p->attr = TEE_IOCTL_PARAM_ATTR_TYPE_NONE; > > > > memset(&p->u, 0, sizeof(p->u)); > > > > break; > > > > case OPTEE_MSG_ATTR_TYPE_VALUE_INPUT: > > > > case OPTEE_MSG_ATTR_TYPE_VALUE_OUTPUT: > > > > case OPTEE_MSG_ATTR_TYPE_VALUE_INOUT: > > > > - optee_from_msg_param_value(p, attr, mp); > > > > + optee_from_msg_param_value(p, attr, mp, update_out); > > > > break; > > > > case OPTEE_MSG_ATTR_TYPE_FMEM_INPUT: > > > > case OPTEE_MSG_ATTR_TYPE_FMEM_OUTPUT: > > > > case OPTEE_MSG_ATTR_TYPE_FMEM_INOUT: > > > > - from_msg_param_ffa_mem(optee, p, attr, mp); > > > > + from_msg_param_ffa_mem(optee, p, attr, mp, update_out); > > > > break; > > > > default: > > > > return -EINVAL; > > > > @@ -188,10 +200,16 @@ static int optee_ffa_from_msg_param(struct optee *optee, > > > > } > > > > > > > > static int to_msg_param_ffa_mem(struct optee_msg_param *mp, > > > > - const struct tee_param *p) > > > > + const struct tee_param *p, bool update_out) > > > > { > > > > struct tee_shm *shm = p->u.memref.shm; > > > > > > > > + if (update_out) { > > > > + if (p->attr == TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_INPUT) > > > > + return 0; > > > > + goto out; > > > > + } > > > > + > > > > mp->attr = OPTEE_MSG_ATTR_TYPE_FMEM_INPUT + p->attr - > > > > TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_INPUT; > > > > > > > > @@ -211,6 +229,7 @@ static int to_msg_param_ffa_mem(struct optee_msg_param *mp, > > > > memset(&mp->u, 0, sizeof(mp->u)); > > > > mp->u.fmem.global_id = OPTEE_MSG_FMEM_INVALID_GLOBAL_ID; > > > > } > > > > +out: > > > > mp->u.fmem.size = p->u.memref.size; > > > > > > > > return 0; > > > > @@ -222,13 +241,15 @@ static int to_msg_param_ffa_mem(struct optee_msg_param *mp, > > > > * @optee: main service struct > > > > * @msg_params: OPTEE_MSG parameters > > > > * @num_params: number of elements in the parameter arrays > > > > - * @params: subsystem itnernal parameter representation > > > > + * @params: subsystem internal parameter representation > > > > + * @update_out: update parameter for output only > > > > * Returns 0 on success or <0 on failure > > > > */ > > > > static int optee_ffa_to_msg_param(struct optee *optee, > > > > struct optee_msg_param *msg_params, > > > > size_t num_params, > > > > - const struct tee_param *params) > > > > + const struct tee_param *params, > > > > + bool update_out) > > > > { > > > > size_t n; > > > > > > > > @@ -238,18 +259,20 @@ static int optee_ffa_to_msg_param(struct optee *optee, > > > > > > > > switch (p->attr) { > > > > case TEE_IOCTL_PARAM_ATTR_TYPE_NONE: > > > > + if (update_out) > > > > + break; > > > > mp->attr = TEE_IOCTL_PARAM_ATTR_TYPE_NONE; > > > > memset(&mp->u, 0, sizeof(mp->u)); > > > > break; > > > > case TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INPUT: > > > > case TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_OUTPUT: > > > > case TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT: > > > > - optee_to_msg_param_value(mp, p); > > > > + optee_to_msg_param_value(mp, p, update_out); > > > > break; > > > > case TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_INPUT: > > > > case TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_OUTPUT: > > > > case TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_INOUT: > > > > - if (to_msg_param_ffa_mem(mp, p)) > > > > + if (to_msg_param_ffa_mem(mp, p, update_out)) > > > > return -EINVAL; > > > > break; > > > > default: > > > > > > Can we rather handle it as follows to improve code readability and > > > maintainence long term? Ditto for all other places. > > > > > > static int optee_ffa_to_msg_param(struct optee *optee, > > > struct optee_msg_param *msg_params, > > > size_t num_params, > > > const struct tee_param *params, > > > bool update_out) > > > { > > > size_t n; > > > > > > for (n = 0; n < num_params; n++) { > > > const struct tee_param *p = params + n; > > > struct optee_msg_param *mp = msg_params + n; > > > > > > if (update_out && (p->attr == TEE_IOCTL_PARAM_ATTR_TYPE_NONE || > > > p->attr == TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INPUT || > > > p->attr == TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_INPUT)) > > > continue; > > > > You're missing updating the length field for memrefs. > > > > Do we need to update length field for input memrefs when update_out is > set? I don't see that happening in your existing patch too. I'm sorry, I was unclear. The update_out parameter means only the output fields should be updated, not the attribute, offsets, ids, etc. That is, the length field for memrefs, and the value fields a, b, c for value params. Some of the memrefs aren't translated one-to-one with SDP, but the length field can and must be updated. Cheers, Jens > > -Sumit > > > Cheers, > > Jens > > > > > > > > switch (p->attr) { > > > case TEE_IOCTL_PARAM_ATTR_TYPE_NONE: > > > mp->attr = TEE_IOCTL_PARAM_ATTR_TYPE_NONE; > > > memset(&mp->u, 0, sizeof(mp->u)); > > > break; > > > case TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INPUT: > > > case TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_OUTPUT: > > > case TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INOUT: > > > optee_to_msg_param_value(mp, p); > > > break; > > > case TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_INPUT: > > > case TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_OUTPUT: > > > case TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_INOUT: > > > if (to_msg_param_ffa_mem(mp, p)) > > > return -EINVAL; > > > break; > > > default: > > > return -EINVAL; > > > } > > > } > > > > > > return 0; > > > } > > > > > > -Sumit

3 months, 3 weeks

[PATCH v2 00/34] drm/msm: sparse / "VM_BIND" support

by Rob Clark

From: Rob Clark <robdclark(a)chromium.org> Conversion to DRM GPU VA Manager[1], and adding support for Vulkan Sparse Memory[2] in the form of: 1. A new VM_BIND submitqueue type for executing VM MSM_SUBMIT_BO_OP_MAP/ MAP_NULL/UNMAP commands 2. Extending the SUBMIT ioctl to allow submitting batches of one or more MAP/MAP_NULL/UNMAP commands to a VM_BIND submitqueue The UABI takes a slightly different approach from what other drivers have done, and what would make sense if starting from a clean sheet, ie separate VM_BIND and EXEC ioctls. But since we have to maintain support for the existing SUBMIT ioctl, and because the fence, syncobj, and BO pinning is largely the same between legacy "BO-table" style SUBMIT ioctls, and new- style VM updates submitted to a VM_BIND submitqueue, I chose to go the route of extending the existing `SUBMIT` ioctl rather than adding a new ioctl. I also did not implement support for synchronous VM_BIND commands. Since userspace could just immediately wait for the `SUBMIT` to complete, I don't think we need this extra complexity in the kernel. Synchronous/immediate VM_BIND operations could be implemented with a 2nd VM_BIND submitqueue. The corresponding mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533 This series can be found in MR form, if you prefer: https://gitlab.freedesktop.org/drm/msm/-/merge_requests/144 Changes in v2: - Dropped Bibek Kumar Patro's arm-smmu patches[3], which have since been merged. - Pre-allocate all the things, and drop HACK patch which disabled shrinker. This includes ensuring that vm_bo objects are allocated up front, pre- allocating VMA objects, and pre-allocating pages used for pgtable updates. The latter utilizes io_pgtable_cfg callbacks for pgtable alloc/free, that were initially added for panthor. - Add back support for BO dumping for devcoredump. - Link to v1 (RFC): https://lore.kernel.org/dri-devel/20241207161651.410556-1-robdclark@gmail.c… [1] https://www.kernel.org/doc/html/next/gpu/drm-mm.html#drm-gpuvm [2] https://docs.vulkan.org/spec/latest/chapters/sparsemem.html [3] https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=909700 Rob Clark (34): drm/gpuvm: Don't require obj lock in destructor path drm/gpuvm: Remove bogus lock assert drm/gpuvm: Allow VAs to hold soft reference to BOs drm/gpuvm: Add drm_gpuvm_sm_unmap_va() drm/msm: Rename msm_file_private -> msm_context drm/msm: Improve msm_context comments drm/msm: Rename msm_gem_address_space -> msm_gem_vm drm/msm: Remove vram carveout support drm/msm: Collapse vma allocation and initialization drm/msm: Collapse vma close and delete drm/msm: drm_gpuvm conversion drm/msm: Use drm_gpuvm types more drm/msm: Split submit_pin_objects() drm/msm: Lazily create context VM drm/msm: Add opt-in for VM_BIND drm/msm: Mark VM as unusable on faults drm/msm: Extend SUBMIT ioctl for VM_BIND drm/msm: Add VM_BIND submitqueue drm/msm: Add _NO_SHARE flag drm/msm: Split out helper to get iommu prot flags drm/msm: Add mmu support for non-zero offset drm/msm: Add PRR support drm/msm: Rename msm_gem_vma_purge() -> _unmap() drm/msm: Split msm_gem_vma_new() drm/msm: Pre-allocate VMAs drm/msm: Pre-allocate vm_bo objects drm/msm: Pre-allocate pages for pgtable entries drm/msm: Wire up gpuvm ops drm/msm: Wire up drm_gpuvm debugfs drm/msm: Crashdump prep for sparse mappings drm/msm: rd dumping prep for sparse mappings drm/msm: Crashdec support for sparse drm/msm: rd dumping support for sparse drm/msm: Bump UAPI version drivers/gpu/drm/drm_gpuvm.c | 141 ++-- drivers/gpu/drm/msm/Kconfig | 1 + drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 25 +- drivers/gpu/drm/msm/adreno/a2xx_gpummu.c | 5 +- drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 17 +- drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 17 +- drivers/gpu/drm/msm/adreno/a5xx_debugfs.c | 4 +- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 24 +- drivers/gpu/drm/msm/adreno/a5xx_power.c | 2 +- drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 10 +- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 32 +- drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 2 +- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 51 +- drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 6 +- drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 10 +- drivers/gpu/drm/msm/adreno/adreno_device.c | 4 - drivers/gpu/drm/msm/adreno/adreno_gpu.c | 84 +- drivers/gpu/drm/msm/adreno/adreno_gpu.h | 23 +- .../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 14 +- drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 18 +- drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h | 2 +- drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 18 +- drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 14 +- drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h | 4 +- drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c | 6 +- drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c | 28 +- drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c | 12 +- drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c | 4 +- drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 19 +- drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c | 12 +- drivers/gpu/drm/msm/dsi/dsi_host.c | 14 +- drivers/gpu/drm/msm/msm_debugfs.c | 20 + drivers/gpu/drm/msm/msm_drv.c | 176 ++--- drivers/gpu/drm/msm/msm_drv.h | 35 +- drivers/gpu/drm/msm/msm_fb.c | 18 +- drivers/gpu/drm/msm/msm_fbdev.c | 2 +- drivers/gpu/drm/msm/msm_gem.c | 437 +++++----- drivers/gpu/drm/msm/msm_gem.h | 226 ++++-- drivers/gpu/drm/msm/msm_gem_prime.c | 15 + drivers/gpu/drm/msm/msm_gem_submit.c | 234 +++++- drivers/gpu/drm/msm/msm_gem_vma.c | 748 ++++++++++++++++-- drivers/gpu/drm/msm/msm_gpu.c | 146 ++-- drivers/gpu/drm/msm/msm_gpu.h | 132 +++- drivers/gpu/drm/msm/msm_iommu.c | 285 ++++++- drivers/gpu/drm/msm/msm_kms.c | 18 +- drivers/gpu/drm/msm/msm_kms.h | 2 +- drivers/gpu/drm/msm/msm_mmu.h | 38 +- drivers/gpu/drm/msm/msm_rd.c | 62 +- drivers/gpu/drm/msm/msm_ringbuffer.c | 4 +- drivers/gpu/drm/msm/msm_submitqueue.c | 86 +- include/drm/drm_gpuvm.h | 14 +- include/uapi/drm/msm_drm.h | 98 ++- 52 files changed, 2359 insertions(+), 1060 deletions(-) -- 2.48.1

3 months, 3 weeks

Re: [PATCH 5/7] drm/gem: Add a flag to control whether objects can be exported

by Christian König

Am 18.03.25 um 20:22 schrieb Daniel Almeida: > From: Asahi Lina <lina(a)asahilina.net> > > Drivers may want to support driver-private objects, which cannot be > shared. This allows them to share a single lock and enables other > optimizations. > > Add an `exportable` field to drm_gem_object, which blocks PRIME export > if set to false. It is initialized to true in > drm_gem_private_object_init. We already have a method for doing that which is used by almost all drivers (except for lsdc). Basically you just create a function which checks the per-requisites if a buffer can be exported before calling drm_gem_prime_export() and installs that as .export callback into the drm_gem_object_funcs. See amdgpu_gem_prime_export() for a simpler example. Regards, Christian. > > Signed-off-by: Asahi Lina <lina(a)asahilina.net> > Signed-off-by: Daniel Almeida <daniel.almeida(a)collabora.com> > --- > drivers/gpu/drm/drm_gem.c | 1 + > drivers/gpu/drm/drm_prime.c | 5 +++++ > include/drm/drm_gem.h | 8 ++++++++ > 3 files changed, 14 insertions(+) > > diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c > index ee811764c3df4b4e9c377a66afd4967512ba2001..8f998fe6beecd285ce3e2d5badfa95eb7d7bd548 100644 > --- a/drivers/gpu/drm/drm_gem.c > +++ b/drivers/gpu/drm/drm_gem.c > @@ -195,6 +195,7 @@ void drm_gem_private_object_init(struct drm_device *dev, > > drm_vma_node_reset(&obj->vma_node); > INIT_LIST_HEAD(&obj->lru_node); > + obj->exportable = true; > } > EXPORT_SYMBOL(drm_gem_private_object_init); > > diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c > index 32a8781cfd67b82ece7b7b94625715171bb41917..20aa350280abe9a6ed6742e131ff50c65bc9dfa9 100644 > --- a/drivers/gpu/drm/drm_prime.c > +++ b/drivers/gpu/drm/drm_prime.c > @@ -387,6 +387,11 @@ static struct dma_buf *export_and_register_object(struct drm_device *dev, > return dmabuf; > } > > + if (!obj->exportable) { > + dmabuf = ERR_PTR(-EINVAL); > + return dmabuf; > + } > + > if (obj->funcs && obj->funcs->export) > dmabuf = obj->funcs->export(obj, flags); > else > diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h > index fdae947682cd0b7b06db5e35e120f049a0f30179..f700e4996eccb92597cca6b8c3df8e35b864c1e1 100644 > --- a/include/drm/drm_gem.h > +++ b/include/drm/drm_gem.h > @@ -432,6 +432,14 @@ struct drm_gem_object { > * The current LRU list that the GEM object is on. > */ > struct drm_gem_lru *lru; > + > + /** > + * @exportable: > + * > + * Whether this GEM object can be exported via the drm_gem_object_funcs->export > + * callback. Defaults to true. > + */ > + bool exportable; > }; > > /** >

3 months, 3 weeks

Re: [PATCH 2/7] drm/gem-shmem: Export VM ops functions

by Christian König

Am 18.03.25 um 20:22 schrieb Daniel Almeida: > From: Asahi Lina <lina(a)asahilina.net> > > There doesn't seem to be a way for the Rust bindings to get a > compile-time constant reference to drm_gem_shmem_vm_ops, so we need to > duplicate that structure in Rust... this isn't nice... Well "isn't nice" is an understatement. We can have that as a short term hack, but I don't think that this is a doable long term solution. For this particular case here it most likely doesn't matter, but operation pointer structures are often used to identify a certain class of object. So exporting the functions and then re-creating the constant operation pointer structure in Rust doesn't work in some cases. Regards, Christian. > > Signed-off-by: Asahi Lina <lina(a)asahilina.net> > Signed-off-by: Daniel Almeida <daniel.almeida(a)collabora.com> > --- > drivers/gpu/drm/drm_gem_shmem_helper.c | 9 ++++++--- > include/drm/drm_gem_shmem_helper.h | 3 +++ > 2 files changed, 9 insertions(+), 3 deletions(-) > > diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c > index ec89e9499f5f02a2a35713669bf649dd2abb9938..be310db5863871604f3502ad1f419937d4c20a84 100644 > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c > +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c > @@ -535,7 +535,7 @@ int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev, > } > EXPORT_SYMBOL_GPL(drm_gem_shmem_dumb_create); > > -static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf) > +vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf) > { > struct vm_area_struct *vma = vmf->vma; > struct drm_gem_object *obj = vma->vm_private_data; > @@ -564,8 +564,9 @@ static vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf) > > return ret; > } > +EXPORT_SYMBOL_GPL(drm_gem_shmem_fault); > > -static void drm_gem_shmem_vm_open(struct vm_area_struct *vma) > +void drm_gem_shmem_vm_open(struct vm_area_struct *vma) > { > struct drm_gem_object *obj = vma->vm_private_data; > struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj); > @@ -586,8 +587,9 @@ static void drm_gem_shmem_vm_open(struct vm_area_struct *vma) > > drm_gem_vm_open(vma); > } > +EXPORT_SYMBOL_GPL(drm_gem_shmem_vm_open); > > -static void drm_gem_shmem_vm_close(struct vm_area_struct *vma) > +void drm_gem_shmem_vm_close(struct vm_area_struct *vma) > { > struct drm_gem_object *obj = vma->vm_private_data; > struct drm_gem_shmem_object *shmem = to_drm_gem_shmem_obj(obj); > @@ -598,6 +600,7 @@ static void drm_gem_shmem_vm_close(struct vm_area_struct *vma) > > drm_gem_vm_close(vma); > } > +EXPORT_SYMBOL_GPL(drm_gem_shmem_vm_close); > > const struct vm_operations_struct drm_gem_shmem_vm_ops = { > .fault = drm_gem_shmem_fault, > diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h > index d22e3fb53631ab655748d7f6c628ffdb402f6324..b70d3cc35bd194e7cd718bee220408b5dda568bf 100644 > --- a/include/drm/drm_gem_shmem_helper.h > +++ b/include/drm/drm_gem_shmem_helper.h > @@ -132,6 +132,9 @@ void drm_gem_shmem_print_info(const struct drm_gem_shmem_object *shmem, > struct drm_printer *p, unsigned int indent); > > extern const struct vm_operations_struct drm_gem_shmem_vm_ops; > +vm_fault_t drm_gem_shmem_fault(struct vm_fault *vmf); > +void drm_gem_shmem_vm_open(struct vm_area_struct *vma); > +void drm_gem_shmem_vm_close(struct vm_area_struct *vma); > > /* > * GEM object functions >

3 months, 3 weeks

Re: [PATCH 2/7] drm/gem-shmem: Export VM ops functions

by Christian König

3 months, 3 weeks

Re: [PATCH v4 0/6] TEE subsystem for restricted dma-buf allocations

by Nicolas Dufresne

Le mardi 04 mars 2025 à 13:15 +0530, Sumit Garg a écrit : > On Tue, Mar 04, 2025 at 08:17:23AM +0100, Jens Wiklander wrote: > > Hi Daniel, > > > > On Fri, Feb 21, 2025 at 3:12 PM Daniel Stone <daniel(a)fooishbar.org> wrote: > > > > > > Hi Sumit, > > > > > > On Fri, 21 Feb 2025 at 11:24, Sumit Garg <sumit.garg(a)linaro.org> wrote: > > > > On Tue, 18 Feb 2025 at 21:52, Daniel Stone <daniel(a)fooishbar.org> wrote: > > > > > dma-heaps was created to solve the problem of having too many > > > > > 'allocate $n bytes from $specialplace' uAPIs. The proliferation was > > > > > painful and making it difficult for userspace to do what it needed to > > > > > do. Userspace doesn't _yet_ make full use of it, but the solution is > > > > > to make userspace make full use of it, not to go create entirely > > > > > separate allocation paths for unclear reasons. > > > > > > > > > > Besides, I'm writing this from a platform that implements SVP not via > > > > > TEE. I've worked on platforms which implement SVP without any TEE, > > > > > where the TEE implementation would be at best a no-op stub, and at > > > > > worst flat-out impossible. > > > > > > > > Can you elaborate the non-TEE use-case for Secure Video Path (SVP) a > > > > bit more? As to how the protected/encrypted media content pipeline > > > > works? Which architecture support does your use-case require? Is there > > > > any higher privileged level firmware interaction required to perform > > > > media content decryption into restricted memory? Do you plan to > > > > upstream corresponding support in near future? > > > > > > You can see the MTK SVP patches on list which use the MTK SMC to mediate it. > > > > > > There are TI Jacinto platforms which implement a 'secure' area > > > configured statically by (IIRC) BL2, with static permissions defined > > > for each AXI endpoint, e.g. CPU write + codec RW + dispc read. I've > > > heard of another SoC vendor doing the same, but I don't think I can > > > share those details. There is no TEE interaction. > > > > > > I'm writing this message from an AMD laptop which implements > > > restricted content paths outside of TEE. I don't have the full picture > > > of how SVP is implemented on AMD systems, but I do know that I don't > > > have any TEE devices exposed. > > > > > > > Let me try to elaborate on the Secure Video Path (SVP) flow requiring > > > > a TEE implementation (in general terms a higher privileged firmware > > > > managing the pipeline as the kernel/user-space has no access > > > > permissions to the plain text media content): > > > > > > > > - [...] > > > > > > Yeah, I totally understand the TEE usecase. I think that TEE is a good > > > design to implement this. I think that TEE should be used for SVP > > > where it makes sense. > > > > > > Please understand that I am _not_ arguing that no-one should use TEE for SVP! > > > > > > > > So, again, let's > > > > > please turn this around: _why_ TEE? Who benefits from exposing this as > > > > > completely separate to the more generic uAPI that we specifically > > > > > designed to handle things like this? > > > > > > > > The bridging between DMA heaps and TEE would still require user-space > > > > to perform an IOCTL into TEE to register the DMA-bufs as you can see > > > > here [1]. Then it will rather be two handles for user-space to manage. > > > > > > Yes, the decoder would need to do this. That's common though: if you > > > want to share a buffer between V4L2 and DRM, you have three handles: > > > the V4L2 buffer handle, the DRM GEM handle, and the dmabuf you use to > > > bridge the two. > > > > > > > Similarly during restricted memory allocation/free we need another > > > > glue layer under DMA heaps to TEE subsystem. > > > > > > Yep. > > > > > > > The reason is simply which has been iterated over many times in the > > > > past threads that: > > > > > > > > "If user-space has to interact with a TEE device for SVP use-case > > > > then why it's not better to ask TEE to allocate restricted DMA-bufs > > > > too" > > > > > > The first word in your proposition is load-bearing. > > > > > > Build out the usecase a little more here. You have a DRMed video > > > stream coming in, which you need to decode (involving TEE for this > > > usecase). You get a dmabuf handle to the decoded frame. You need to > > > pass the dmabuf across to the Wayland compositor. The compositor needs > > > to pass it to EGL/Vulkan to import and do composition, which in turn > > > passes it to the GPU DRM driver. The output of the composition is in > > > turn shared between the GPU DRM driver and the separate KMS DRM > > > driver, with the involvement of GBM. > > > > > > For the platforms I'm interested in, the GPU DRM driver needs to > > > switch into protected mode, which has no involvement at all with TEE - > > > it's architecturally impossible to have TEE involved without moving > > > most of the GPU driver into TEE and destroying performance. The > > > display hardware also needs to engage protected mode, which again has > > > no involvement with TEE and again would need to have half the driver > > > moved into TEE for no benefit in order to do so. The Wayland > > > compositor also has no interest in TEE: it tells the GPU DRM driver > > > about the protected status of its buffers, and that's it. > > > > > > What these components _are_ opinionated about, is the way buffers are > > > allocated and managed. We built out dmabuf modifiers for this usecase, > > > and we have a good negotiation protocol around that. We also really > > > care about buffer placement in some usecases - e.g. some display/codec > > > hardware requires buffers to be sourced from contiguous memory, other > > > hardware needs to know that when it shares buffers with another > > > device, it needs to place the buffers outside of inaccessible/slow > > > local RAM. So we built out dma-heaps, so every part of the component > > > in the stack can communicate their buffer-placement needs in the same > > > way as we do modifiers, and negotiate an acceptable allocation. > > > > > > That's my starting point for this discussion. We have a mechanism to > > > deal with the fact that buffers need to be shared between different IP > > > blocks which have their own constraints on buffer placement, avoiding > > > the current problem of having every subsystem reinvent their own > > > allocation uAPI which was burying us in impedance mismatch and > > > confusion. That mechanism is dma-heaps. It seems like your starting > > > point from this discussion is that you've implemented a TEE-centric > > > design for SVP, and so all of userspace should bypass our existing > > > cross-subsystem special-purpose allocation mechanism, and write > > > specifically to one implementation. I believe that is a massive step > > > backwards and an immediate introduction of technical debt. > > > > > > Again, having an implementation of SVP via TEE makes a huge amount of > > > sense. Having _most_ SVP implementations via TEE still makes a lot of > > > sense. Having _all_ SVP implementations eventually be via TEE would > > > still make sense. But even if we were at that point - which we aren't > > > - it still doesn't justify telling userspace 'use the generic dma-heap > > > uAPI for every device-specific allocation constraint, apart from SVP > > > which has a completely different way to allocate some bytes'. > > > > I must admit that I don't see how this makes a significant difference, > > but then I haven't hacked much in the stacks you're talking about, so > > I'm going to take your word for it. > > > > I've experimented with providing a dma-heap replacing the TEE API. The > > implementation is more complex than I first anticipated, adding about > > 400 lines to the patch set. > > I did anticipated this but let's give it a try and see if DMA heaps > really adds any value from user-space point of view. If it does then it > will be worth the maintenence overhead. > > > From user space, it looks like another > > dma-heap. I'm using the names you gave earlier, > > protected,secure-video, protected,trusted-ui, and > > protected,secure-video-record. However, I wonder if we shouldn't use > > "restricted" instead of "protected" since we had agreed to call it > > restricted memory earlier. > > Let's stick with "restricted" memory buffer references only. Until now, we didn't have a standard to balance our naming choice, we simply wanted to move away from "secure" which didn't mean much, and restricted met our needs. I think the discussion is worth having again, now that there is a standard that decided toward "protected". Matchcing the Khronos standard means reducing a lot of confusion. https://docs.vulkan.org/guide/latest/protected.html regards, Nicolas

3 months, 3 weeks

Re: [RFC PATCH 08/12] vfio/pci: Create host unaccessible dma-buf for private device

by Jason Gunthorpe

On Mon, Jan 20, 2025 at 08:45:51PM +1100, Alexey Kardashevskiy wrote: > > For CC I'm expecting the KVM fd to be the handle for the cVM, so any > > RPCs that want to call into the secure world need the KVM FD to get > > the cVM's identifier. Ie a "bind to cVM" RPC will need the PCI > > information and the cVM's handle. > > And keep KVM fd open until unbind? Or just for the short time to call the > PSP? iommufd will keep the KVM fd alive so long as the vIOMMU object exists. Other uses for kvm require it to work like this. > > But it also seems to me that VFIO should be able to support putting > > the device into the RUN state without involving KVM or cVMs. > > AMD's TDI bind handler in the PSP wants a guest handle ("GCTX") and a guest > device BDFn, and VFIO has no desire to dive into this KVM business beyond > IOMMUFD. As in my other email, VFIO is not restricted to running VMs, useful things should be available to apps like DPDK. There is a use case for using TDISP and getting devices up into an ecrypted/attested state on pure bare metal without any KVM, VFIO should work in that use case too. Jason

3 months, 3 weeks

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

Linaro-mm-sig March 2025