Linaro-mm-sig - lists.linaro.org

[PATCH AUTOSEL 6.14 241/642] drm/gem: Test for imported GEM buffers with helper

by Sasha Levin

From: Thomas Zimmermann <tzimmermann(a)suse.de>

[ Upstream commit b57aa47d39e94dc47403a745e2024664e544078c ]

Add drm_gem_is_imported() that tests if a GEM object's buffer has
been imported. Update the GEM code accordingly.

GEM code usually tests for imports if import_attach has been set
in struct drm_gem_object. But attaching a dma-buf on import requires
a DMA-capable importer device, which is not the case for many serial
busses like USB or I2C. The new helper tests if a GEM object's dma-buf
has been created from the GEM object.

Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Reviewed-by: Anusha Srivatsa <asrivats(a)redhat.com>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250226172457.217725-2-tzimm…
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 drivers/gpu/drm/drm_gem.c |  4 ++--
 include/drm/drm_gem.h     | 14 ++++++++++++++
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index ee811764c3df4..c6240bab3fa55 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -348,7 +348,7 @@ int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev,
 		return -ENOENT;

 	/* Don't allow imported objects to be mapped */
-	if (obj->import_attach) {
+	if (drm_gem_is_imported(obj)) {
 		ret = -EINVAL;
 		goto out;
 	}
@@ -1178,7 +1178,7 @@ void drm_gem_print_info(struct drm_printer *p, unsigned int indent,
 			  drm_vma_node_start(&obj->vma_node));
 	drm_printf_indent(p, indent, "size=%zu\n", obj->size);
 	drm_printf_indent(p, indent, "imported=%s\n",
-			  str_yes_no(obj->import_attach));
+			  str_yes_no(drm_gem_is_imported(obj)));

 	if (obj->funcs->print_info)
 		obj->funcs->print_info(p, indent, obj);
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index fdae947682cd0..2bf893eabb4b2 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -35,6 +35,7 @@
  */

 #include <linux/kref.h>
+#include <linux/dma-buf.h>
 #include <linux/dma-resv.h>
 #include <linux/list.h>
 #include <linux/mutex.h>
@@ -575,6 +576,19 @@ static inline bool drm_gem_object_is_shared_for_memory_stats(struct drm_gem_obje
 	return (obj->handle_count > 1) || obj->dma_buf;
 }

+/**
+ * drm_gem_is_imported() - Tests if GEM object's buffer has been imported
+ * @obj: the GEM object
+ *
+ * Returns:
+ * True if the GEM object's buffer has been imported, false otherwise
+ */
+static inline bool drm_gem_is_imported(const struct drm_gem_object *obj)
+{
+	/* The dma-buf's priv field points to the original GEM object. */
+	return obj->dma_buf && (obj->dma_buf->priv != obj);
+}
+
 #ifdef CONFIG_LOCKDEP
 /**
  * drm_gem_gpuva_set_lock() - Set the lock protecting accesses to the gpuva list.
-- 
2.39.5
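
To make the check in the patch above concrete, here is a minimal userspace sketch. It is not kernel code: the model_ types are hypothetical stand-ins that keep only the fields the helper reads, and no real dma-buf is involved. It illustrates that an object which exported its own buffer has dma_buf->priv pointing back at itself, while an importer's object wraps a dma-buf whose priv belongs to another object, even when no attachment was ever created.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct dma_buf; only priv is modeled. */
struct model_dma_buf {
	void *priv;	/* for GEM exports this points at the exporting GEM object */
};

/* Hypothetical stand-in for struct drm_gem_object. */
struct model_gem_object {
	struct model_dma_buf *dma_buf;	/* NULL while the buffer is not shared */
	void *import_attach;		/* may stay NULL on importers without DMA */
};

/* Same test as drm_gem_is_imported() in the patch, applied to the model types. */
static inline bool model_gem_is_imported(const struct model_gem_object *obj)
{
	assert(obj != NULL);
	return obj->dma_buf && (obj->dma_buf->priv != obj);
}
```

Under these assumptions, a test on import_attach alone would report "not imported" for an importer that never attached, which is the case the commit message describes for USB or I2C devices.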

9 months, 2 weeks


[PATCH AUTOSEL 6.14 061/642] drm/amdgpu: use GFP_NOWAIT for memory allocations

by Sasha Levin

From: Christian König <christian.koenig(a)amd.com>

[ Upstream commit 16590745b571c07869ef8958e0bbe44ab6f08d1f ]

In the critical submission path memory allocations can't wait for
reclaim since that can potentially wait for submissions to finish.

Finally clean that up and mark most memory allocations in the critical
path with GFP_NOWAIT. The only exception left is the dma_fence_array()
used when no VMID is available, but that will be cleaned up later on.

Signed-off-by: Christian König <christian.koenig(a)amd.com>
Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c |  8 ++++----
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c       | 18 +++++++++++-------
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c      | 11 +++++++----
 drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c      |  4 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c     | 11 ++++++-----
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h     |  3 ++-
 6 files changed, 32 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 1e998f972c308..70224b9f54f2f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -499,7 +499,7 @@ static int vm_update_pds(struct amdgpu_vm *vm, struct amdgpu_sync *sync)
 	if (ret)
 		return ret;

-	return amdgpu_sync_fence(sync, vm->last_update);
+	return amdgpu_sync_fence(sync, vm->last_update, GFP_KERNEL);
 }

 static uint64_t get_pte_flags(struct amdgpu_device *adev, struct kgd_mem *mem)
@@ -1263,7 +1263,7 @@ static int unmap_bo_from_gpuvm(struct kgd_mem *mem,

 	(void)amdgpu_vm_clear_freed(adev, vm, &bo_va->last_pt_update);

-	(void)amdgpu_sync_fence(sync, bo_va->last_pt_update);
+	(void)amdgpu_sync_fence(sync, bo_va->last_pt_update, GFP_KERNEL);

 	return 0;
 }
@@ -1287,7 +1287,7 @@ static int update_gpuvm_pte(struct kgd_mem *mem,
 		return ret;
 	}

-	return amdgpu_sync_fence(sync, bo_va->last_pt_update);
+	return amdgpu_sync_fence(sync, bo_va->last_pt_update, GFP_KERNEL);
 }

 static int map_bo_to_gpuvm(struct kgd_mem *mem,
@@ -2969,7 +2969,7 @@ int amdgpu_amdkfd_gpuvm_restore_process_bos(void *info, struct dma_fence __rcu *
 		}
 		dma_resv_for_each_fence(&cursor, bo->tbo.base.resv,
 					DMA_RESV_USAGE_KERNEL, fence) {
-			ret = amdgpu_sync_fence(&sync_obj, fence);
+			ret = amdgpu_sync_fence(&sync_obj, fence, GFP_KERNEL);
 			if (ret) {
 				pr_debug("Memory eviction: Sync BO fence failed. Try again\n");
 				goto validate_map_fail;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 5cc5f59e30184..4a5b406601fa2 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -428,7 +428,7 @@ static int amdgpu_cs_p2_dependencies(struct amdgpu_cs_parser *p,
 			dma_fence_put(old);
 		}

-		r = amdgpu_sync_fence(&p->sync, fence);
+		r = amdgpu_sync_fence(&p->sync, fence, GFP_KERNEL);
 		dma_fence_put(fence);
 		if (r)
 			return r;
@@ -450,7 +450,7 @@ static int amdgpu_syncobj_lookup_and_add(struct amdgpu_cs_parser *p,
 		return r;
 	}

-	r = amdgpu_sync_fence(&p->sync, fence);
+	r = amdgpu_sync_fence(&p->sync, fence, GFP_KERNEL);
 	dma_fence_put(fence);
 	return r;
 }
@@ -1124,7 +1124,8 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
 	if (r)
 		return r;

-	r = amdgpu_sync_fence(&p->sync, fpriv->prt_va->last_pt_update);
+	r = amdgpu_sync_fence(&p->sync, fpriv->prt_va->last_pt_update,
+			      GFP_KERNEL);
 	if (r)
 		return r;

@@ -1135,7 +1136,8 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
 		if (r)
 			return r;

-		r = amdgpu_sync_fence(&p->sync, bo_va->last_pt_update);
+		r = amdgpu_sync_fence(&p->sync, bo_va->last_pt_update,
+				      GFP_KERNEL);
 		if (r)
 			return r;
 	}
@@ -1154,7 +1156,8 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
 		if (r)
 			return r;

-		r = amdgpu_sync_fence(&p->sync, bo_va->last_pt_update);
+		r = amdgpu_sync_fence(&p->sync, bo_va->last_pt_update,
+				      GFP_KERNEL);
 		if (r)
 			return r;
 	}
@@ -1167,7 +1170,7 @@ static int amdgpu_cs_vm_handling(struct amdgpu_cs_parser *p)
 	if (r)
 		return r;

-	r = amdgpu_sync_fence(&p->sync, vm->last_update);
+	r = amdgpu_sync_fence(&p->sync, vm->last_update, GFP_KERNEL);
 	if (r)
 		return r;

@@ -1248,7 +1251,8 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
 			continue;
 		}

-		r = amdgpu_sync_fence(&p->gang_leader->explicit_sync, fence);
+		r = amdgpu_sync_fence(&p->gang_leader->explicit_sync, fence,
+				      GFP_KERNEL);
 		dma_fence_put(fence);
 		if (r)
 			return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
index 9008b7388e897..92ab821afc06a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -209,7 +209,7 @@ static int amdgpu_vmid_grab_idle(struct amdgpu_ring *ring,
 		return 0;
 	}

-	fences = kmalloc_array(id_mgr->num_ids, sizeof(void *), GFP_KERNEL);
+	fences = kmalloc_array(id_mgr->num_ids, sizeof(void *), GFP_NOWAIT);
 	if (!fences)
 		return -ENOMEM;

@@ -313,7 +313,8 @@ static int amdgpu_vmid_grab_reserved(struct amdgpu_vm *vm,
 	/* Good we can use this VMID. Remember this submission as
 	 * user of the VMID.
 	 */
-	r = amdgpu_sync_fence(&(*id)->active, &job->base.s_fence->finished);
+	r = amdgpu_sync_fence(&(*id)->active, &job->base.s_fence->finished,
+			      GFP_NOWAIT);
 	if (r)
 		return r;

@@ -372,7 +373,8 @@ static int amdgpu_vmid_grab_used(struct amdgpu_vm *vm,
 		 * user of the VMID.
 		 */
 		r = amdgpu_sync_fence(&(*id)->active,
-				      &job->base.s_fence->finished);
+				      &job->base.s_fence->finished,
+				      GFP_NOWAIT);
 		if (r)
 			return r;

@@ -424,7 +426,8 @@ int amdgpu_vmid_grab(struct amdgpu_vm *vm, struct amdgpu_ring *ring,

 			/* Remember this submission as user of the VMID */
 			r = amdgpu_sync_fence(&id->active,
-					      &job->base.s_fence->finished);
+					      &job->base.s_fence->finished,
+					      GFP_NOWAIT);
 			if (r)
 				goto error;

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
index 6fa20980a0b15..e4251d0691c9c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c
@@ -1335,14 +1335,14 @@ int amdgpu_mes_ctx_map_meta_data(struct amdgpu_device *adev,
 		DRM_ERROR("failed to do vm_bo_update on meta data\n");
 		goto error_del_bo_va;
 	}
-	amdgpu_sync_fence(&sync, bo_va->last_pt_update);
+	amdgpu_sync_fence(&sync, bo_va->last_pt_update, GFP_KERNEL);

 	r = amdgpu_vm_update_pdes(adev, vm, false);
 	if (r) {
 		DRM_ERROR("failed to update pdes on meta data\n");
 		goto error_del_bo_va;
 	}
-	amdgpu_sync_fence(&sync, vm->last_update);
+	amdgpu_sync_fence(&sync, vm->last_update, GFP_KERNEL);

 	amdgpu_sync_wait(&sync, false);
 	drm_exec_fini(&exec);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index d75715b3f1870..34fc742fda91d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -152,7 +152,8 @@ static bool amdgpu_sync_add_later(struct amdgpu_sync *sync, struct dma_fence *f)
  *
  * Add the fence to the sync object.
  */
-int amdgpu_sync_fence(struct amdgpu_sync *sync, struct dma_fence *f)
+int amdgpu_sync_fence(struct amdgpu_sync *sync, struct dma_fence *f,
+		      gfp_t flags)
 {
 	struct amdgpu_sync_entry *e;

@@ -162,7 +163,7 @@ int amdgpu_sync_fence(struct amdgpu_sync *sync, struct dma_fence *f)
 	if (amdgpu_sync_add_later(sync, f))
 		return 0;

-	e = kmem_cache_alloc(amdgpu_sync_slab, GFP_KERNEL);
+	e = kmem_cache_alloc(amdgpu_sync_slab, flags);
 	if (!e)
 		return -ENOMEM;

@@ -249,7 +250,7 @@ int amdgpu_sync_resv(struct amdgpu_device *adev, struct amdgpu_sync *sync,
 		struct dma_fence *tmp = dma_fence_chain_contained(f);

 		if (amdgpu_sync_test_fence(adev, mode, owner, tmp)) {
-			r = amdgpu_sync_fence(sync, f);
+			r = amdgpu_sync_fence(sync, f, GFP_KERNEL);
 			dma_fence_put(f);
 			if (r)
 				return r;
@@ -281,7 +282,7 @@ int amdgpu_sync_kfd(struct amdgpu_sync *sync, struct dma_resv *resv)
 		if (fence_owner != AMDGPU_FENCE_OWNER_KFD)
 			continue;

-		r = amdgpu_sync_fence(sync, f);
+		r = amdgpu_sync_fence(sync, f, GFP_KERNEL);
 		if (r)
 			break;
 	}
@@ -388,7 +389,7 @@ int amdgpu_sync_clone(struct amdgpu_sync *source, struct amdgpu_sync *clone)
 	hash_for_each_safe(source->fences, i, tmp, e, node) {
 		f = e->fence;
 		if (!dma_fence_is_signaled(f)) {
-			r = amdgpu_sync_fence(clone, f);
+			r = amdgpu_sync_fence(clone, f, GFP_KERNEL);
 			if (r)
 				return r;
 		} else {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
index a91a8eaf808b1..51eb4382c91eb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
@@ -47,7 +47,8 @@ struct amdgpu_sync {
 };

 void amdgpu_sync_create(struct amdgpu_sync *sync);
-int amdgpu_sync_fence(struct amdgpu_sync *sync, struct dma_fence *f);
+int amdgpu_sync_fence(struct amdgpu_sync *sync, struct dma_fence *f,
+		      gfp_t flags);
 int amdgpu_sync_resv(struct amdgpu_device *adev, struct amdgpu_sync *sync,
 		     struct dma_resv *resv, enum amdgpu_sync_mode mode,
 		     void *owner);
-- 
2.39.5
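
The pattern in the patch above, where the caller rather than the callee chooses the allocation flags, can be sketched in plain userspace C. Everything here is a hypothetical model: the model_ names, the toy pool, and the two-value flag enum are assumptions for illustration, and the real GFP flags carry more behaviour than this. The sketch only shows the one property the commit message relies on: the non-waiting flavour fails immediately instead of waiting for reclaim.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags standing in for GFP_KERNEL and GFP_NOWAIT. */
enum model_gfp { MODEL_GFP_KERNEL, MODEL_GFP_NOWAIT };

/* Toy allocator state: entries on hand and entries that reclaim could free. */
struct model_pool {
	int free_entries;
	int reclaimable;
	bool reclaim_ran;	/* records whether an allocation had to wait */
};

/* Returns true if an entry could be obtained under the given flags. */
static bool model_alloc(struct model_pool *pool, enum model_gfp flags)
{
	assert(pool != NULL);
	if (pool->free_entries == 0) {
		/* Only the blocking flavour is allowed to wait for reclaim. */
		if (flags == MODEL_GFP_NOWAIT || pool->reclaimable == 0)
			return false;
		pool->reclaim_ran = true;
		pool->free_entries += pool->reclaimable;
		pool->reclaimable = 0;
	}
	pool->free_entries--;
	return true;
}

/* Mirrors the new amdgpu_sync_fence() shape: the caller passes the flags. */
static int model_sync_fence(struct model_pool *pool, enum model_gfp flags)
{
	return model_alloc(pool, flags) ? 0 : -ENOMEM;
}
```

In this model a caller on the submission path would pass MODEL_GFP_NOWAIT and handle -ENOMEM, while callers outside it keep the blocking flavour, which matches how the diff leaves most call sites on GFP_KERNEL and switches only the VMID code.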

9 months, 2 weeks


[PATCH AUTOSEL 6.14 060/642] drm/amdgpu: rework how isolation is enforced v2

by Sasha Levin

From: Christian König <christian.koenig(a)amd.com>

[ Upstream commit bd22e44ad415ac22e3a4f9a983d2a085f6cb4427 ]

Limiting the number of available VMIDs to enforce isolation causes some
issues with gang submit and applying certain HW workarounds which
require multiple VMIDs to work correctly.

So instead start to track all submissions to the relevant engines in a
per partition data structure and use the dma_fences of the submissions
to enforce isolation similar to what a VMID limit does.

v2: use ~0l for jobs without isolation to distinct it from kernel
    submissions which uses NULL for the owner. Add some warning when we
    are OOM.

Signed-off-by: Christian König <christian.koenig(a)amd.com>
Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        | 13 ++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 98 +++++++++++++++++++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c    | 43 ++++------
 drivers/gpu/drm/amd/amdgpu/amdgpu_job.c    | 16 +++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c   | 19 +++++
 drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h   |  1 +
 6 files changed, 155 insertions(+), 35 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 98f0c12df12bc..9a61f5fe3245a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1187,9 +1187,15 @@ struct amdgpu_device {
 	bool debug_enable_ras_aca;
 	bool debug_exp_resets;

-	bool enforce_isolation[MAX_XCP];
-	/* Added this mutex for cleaner shader isolation between GFX and compute processes */
+	/* Protection for the following isolation structure */
 	struct mutex enforce_isolation_mutex;
+	bool enforce_isolation[MAX_XCP];
+	struct amdgpu_isolation {
+		void *owner;
+		struct dma_fence *spearhead;
+		struct amdgpu_sync active;
+		struct amdgpu_sync prev;
+	} isolation[MAX_XCP];

 	struct amdgpu_init_level *init_lvl;
 };
@@ -1470,6 +1476,9 @@ void amdgpu_device_pcie_port_wreg(struct amdgpu_device *adev,
 struct dma_fence *amdgpu_device_get_gang(struct amdgpu_device *adev);
 struct dma_fence *amdgpu_device_switch_gang(struct amdgpu_device *adev,
 					    struct dma_fence *gang);
+struct dma_fence *amdgpu_device_enforce_isolation(struct amdgpu_device *adev,
+						  struct amdgpu_ring *ring,
+						  struct amdgpu_job *job);
 bool amdgpu_device_has_display_hardware(struct amdgpu_device *adev);
 ssize_t amdgpu_get_soft_full_reset_mask(struct amdgpu_ring *ring);
 ssize_t amdgpu_show_reset_mask(char *buf, uint32_t supported_reset);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 71e8a76180ad6..e298b48488c22 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4232,6 +4232,11 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	mutex_init(&adev->gfx.reset_sem_mutex);
 	/* Initialize the mutex for cleaner shader isolation between GFX and compute processes */
 	mutex_init(&adev->enforce_isolation_mutex);
+	for (i = 0; i < MAX_XCP; ++i) {
+		adev->isolation[i].spearhead = dma_fence_get_stub();
+		amdgpu_sync_create(&adev->isolation[i].active);
+		amdgpu_sync_create(&adev->isolation[i].prev);
+	}
 	mutex_init(&adev->gfx.kfd_sch_mutex);

 	amdgpu_device_init_apu_flags(adev);
@@ -4731,7 +4736,7 @@ void amdgpu_device_fini_hw(struct amdgpu_device *adev)

 void amdgpu_device_fini_sw(struct amdgpu_device *adev)
 {
-	int idx;
+	int i, idx;
 	bool px;

 	amdgpu_device_ip_fini(adev);
@@ -4739,6 +4744,11 @@ void amdgpu_device_fini_sw(struct amdgpu_device *adev)
 	amdgpu_ucode_release(&adev->firmware.gpu_info_fw);
 	adev->accel_working = false;
 	dma_fence_put(rcu_dereference_protected(adev->gang_submit, true));
+	for (i = 0; i < MAX_XCP; ++i) {
+		dma_fence_put(adev->isolation[i].spearhead);
+		amdgpu_sync_free(&adev->isolation[i].active);
+		amdgpu_sync_free(&adev->isolation[i].prev);
+	}

 	amdgpu_reset_fini(adev);

@@ -6875,6 +6885,92 @@ struct dma_fence *amdgpu_device_switch_gang(struct amdgpu_device *adev,
 	return NULL;
 }

+/**
+ * amdgpu_device_enforce_isolation - enforce HW isolation
+ * @adev: the amdgpu device pointer
+ * @ring: the HW ring the job is supposed to run on
+ * @job: the job which is about to be pushed to the HW ring
+ *
+ * Makes sure that only one client at a time can use the GFX block.
+ * Returns: The dependency to wait on before the job can be pushed to the HW.
+ * The function is called multiple times until NULL is returned.
+ */
+struct dma_fence *amdgpu_device_enforce_isolation(struct amdgpu_device *adev,
+						  struct amdgpu_ring *ring,
+						  struct amdgpu_job *job)
+{
+	struct amdgpu_isolation *isolation = &adev->isolation[ring->xcp_id];
+	struct drm_sched_fence *f = job->base.s_fence;
+	struct dma_fence *dep;
+	void *owner;
+	int r;
+
+	/*
+	 * For now enforce isolation only for the GFX block since we only need
+	 * the cleaner shader on those rings.
+	 */
+	if (ring->funcs->type != AMDGPU_RING_TYPE_GFX &&
+	    ring->funcs->type != AMDGPU_RING_TYPE_COMPUTE)
+		return NULL;
+
+	/*
+	 * All submissions where enforce isolation is false are handled as if
+	 * they come from a single client. Use ~0l as the owner to distinct it
+	 * from kernel submissions where the owner is NULL.
+	 */
+	owner = job->enforce_isolation ? f->owner : (void *)~0l;
+
+	mutex_lock(&adev->enforce_isolation_mutex);
+
+	/*
+	 * The "spearhead" submission is the first one which changes the
+	 * ownership to its client. We always need to wait for it to be
+	 * pushed to the HW before proceeding with anything.
+	 */
+	if (&f->scheduled != isolation->spearhead &&
+	    !dma_fence_is_signaled(isolation->spearhead)) {
+		dep = isolation->spearhead;
+		goto out_grab_ref;
+	}
+
+	if (isolation->owner != owner) {
+
+		/*
+		 * Wait for any gang to be assembled before switching to a
+		 * different owner or otherwise we could deadlock the
+		 * submissions.
+		 */
+		if (!job->gang_submit) {
+			dep = amdgpu_device_get_gang(adev);
+			if (!dma_fence_is_signaled(dep))
+				goto out_return_dep;
+			dma_fence_put(dep);
+		}
+
+		dma_fence_put(isolation->spearhead);
+		isolation->spearhead = dma_fence_get(&f->scheduled);
+		amdgpu_sync_move(&isolation->active, &isolation->prev);
+		isolation->owner = owner;
+	}
+
+	/*
+	 * Specifying the ring here helps to pipeline submissions even when
+	 * isolation is enabled. If that is not desired for testing NULL can be
+	 * used instead of the ring to enforce a CPU round trip while switching
+	 * between clients.
+	 */
+	dep = amdgpu_sync_peek_fence(&isolation->prev, ring);
+	r = amdgpu_sync_fence(&isolation->active, &f->finished, GFP_NOWAIT);
+	if (r)
+		DRM_WARN("OOM tracking isolation\n");
+
+out_grab_ref:
+	dma_fence_get(dep);
+out_return_dep:
+	mutex_unlock(&adev->enforce_isolation_mutex);
+	return dep;
+}
+
 bool amdgpu_device_has_display_hardware(struct amdgpu_device *adev)
 {
 	switch (adev->asic_type) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
index 8e712a11aba5d..9008b7388e897 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ids.c
@@ -287,40 +287,27 @@ static int amdgpu_vmid_grab_reserved(struct amdgpu_vm *vm,
 	    (*id)->flushed_updates < updates ||
 	    !(*id)->last_flush ||
 	    ((*id)->last_flush->context != fence_context &&
-	     !dma_fence_is_signaled((*id)->last_flush))) {
+	     !dma_fence_is_signaled((*id)->last_flush)))
+		needs_flush = true;
+
+	if ((*id)->owner != vm->immediate.fence_context ||
+	    (!adev->vm_manager.concurrent_flush && needs_flush)) {
 		struct dma_fence *tmp;

-		/* Wait for the gang to be assembled before using a
-		 * reserved VMID or otherwise the gang could deadlock.
+		/* Don't use per engine and per process VMID at the
+		 * same time
 		 */
-		tmp = amdgpu_device_get_gang(adev);
-		if (!dma_fence_is_signaled(tmp) && tmp != job->gang_submit) {
+		if (adev->vm_manager.concurrent_flush)
+			ring = NULL;
+
+		/* to prevent one context starved by another context */
+		(*id)->pd_gpu_addr = 0;
+		tmp = amdgpu_sync_peek_fence(&(*id)->active, ring);
+		if (tmp) {
 			*id = NULL;
-			*fence = tmp;
+			*fence = dma_fence_get(tmp);
 			return 0;
 		}
-		dma_fence_put(tmp);
-
-		/* Make sure the id is owned by the gang before proceeding */
-		if (!job->gang_submit ||
-		    (*id)->owner != vm->immediate.fence_context) {
-
-			/* Don't use per engine and per process VMID at the
-			 * same time
-			 */
-			if (adev->vm_manager.concurrent_flush)
-				ring = NULL;
-
-			/* to prevent one context starved by another context */
-			(*id)->pd_gpu_addr = 0;
-			tmp = amdgpu_sync_peek_fence(&(*id)->active, ring);
-			if (tmp) {
-				*id = NULL;
-				*fence = dma_fence_get(tmp);
-				return 0;
-			}
-		}
-		needs_flush = true;
 	}

 	/* Good we can use this VMID. Remember this submission as
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 100f044759435..685c61a05af85 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -342,17 +342,24 @@ amdgpu_job_prepare_job(struct drm_sched_job *sched_job,
 {
 	struct amdgpu_ring *ring = to_amdgpu_ring(s_entity->rq->sched);
 	struct amdgpu_job *job = to_amdgpu_job(sched_job);
-	struct dma_fence *fence = NULL;
+	struct dma_fence *fence;
 	int r;

 	r = drm_sched_entity_error(s_entity);
 	if (r)
 		goto error;

-	if (job->gang_submit)
+	if (job->gang_submit) {
 		fence = amdgpu_device_switch_gang(ring->adev, job->gang_submit);
+		if (fence)
+			return fence;
+	}
+
+	fence = amdgpu_device_enforce_isolation(ring->adev, ring, job);
+	if (fence)
+		return fence;

-	if (!fence && job->vm && !job->vmid) {
+	if (job->vm && !job->vmid) {
 		r = amdgpu_vmid_grab(job->vm, ring, job, &fence);
 		if (r) {
 			dev_err(ring->adev->dev, "Error getting VM ID (%d)\n", r);
@@ -365,9 +372,10 @@ amdgpu_job_prepare_job(struct drm_sched_job *sched_job,
 		 */
 		if (!fence)
 			job->vm = NULL;
+		return fence;
 	}

-	return fence;
+	return NULL;

 error:
 	dma_fence_set_error(&job->base.s_fence->finished, r);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
index c586ab4c911bf..d75715b3f1870 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c
@@ -399,6 +399,25 @@ int amdgpu_sync_clone(struct amdgpu_sync *source, struct amdgpu_sync *clone)
 	return 0;
 }

+/**
+ * amdgpu_sync_move - move all fences from src to dst
+ *
+ * @src: source of the fences, empty after function
+ * @dst: destination for the fences
+ *
+ * Moves all fences from source to destination. All fences in destination are
+ * freed and source is empty after the function call.
+ */
+void amdgpu_sync_move(struct amdgpu_sync *src, struct amdgpu_sync *dst)
+{
+	unsigned int i;
+
+	amdgpu_sync_free(dst);
+
+	for (i = 0; i < HASH_SIZE(src->fences); ++i)
+		hlist_move_list(&src->fences[i], &dst->fences[i]);
+}
+
 /**
  * amdgpu_sync_push_to_job - push fences into job
  * @sync: sync object to get the fences from
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
index e3272dce798d7..a91a8eaf808b1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sync.h
@@ -56,6 +56,7 @@ struct dma_fence *amdgpu_sync_peek_fence(struct amdgpu_sync *sync,
 				     struct amdgpu_ring *ring);
 struct dma_fence *amdgpu_sync_get_fence(struct amdgpu_sync *sync);
 int amdgpu_sync_clone(struct amdgpu_sync *source, struct amdgpu_sync *clone);
+void amdgpu_sync_move(struct amdgpu_sync *src, struct amdgpu_sync *dst);
 int amdgpu_sync_push_to_job(struct amdgpu_sync *sync, struct amdgpu_job *job);
 int amdgpu_sync_wait(struct amdgpu_sync *sync, bool intr);
 void amdgpu_sync_free(struct amdgpu_sync *sync);
-- 
2.39.5
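
The contract documented for amdgpu_sync_move() in the patch above (the destination's old fences are freed, the source ends up empty) is what lets the isolation code turn the outgoing owner's "active" set into the "prev" set on an owner switch. A minimal userspace sketch of that contract follows. It is a model under stated assumptions, not the driver's implementation: the model_ types, the four-bucket table, and the integer fence ids are all hypothetical, and plain singly linked lists stand in for the kernel's hlist buckets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_BUCKETS 4	/* hypothetical; the real hash table size differs */

struct model_entry {
	struct model_entry *next;
	unsigned int fence_id;
};

struct model_sync {
	struct model_entry *buckets[MODEL_BUCKETS];
};

/* Add one entry; returns 0 on success, -1 if the allocation fails. */
static int model_sync_add(struct model_sync *sync, unsigned int fence_id)
{
	struct model_entry *e = malloc(sizeof(*e));

	if (!e)
		return -1;
	e->fence_id = fence_id;
	e->next = sync->buckets[fence_id % MODEL_BUCKETS];
	sync->buckets[fence_id % MODEL_BUCKETS] = e;
	return 0;
}

/* Free every entry and leave the container empty. */
static void model_sync_free(struct model_sync *sync)
{
	for (size_t i = 0; i < MODEL_BUCKETS; ++i) {
		while (sync->buckets[i]) {
			struct model_entry *e = sync->buckets[i];

			sync->buckets[i] = e->next;
			free(e);
		}
	}
}

/* Count the entries currently tracked. */
static size_t model_sync_count(const struct model_sync *sync)
{
	size_t n = 0;

	for (size_t i = 0; i < MODEL_BUCKETS; ++i)
		for (const struct model_entry *e = sync->buckets[i]; e; e = e->next)
			n++;
	return n;
}

/* Same contract as amdgpu_sync_move(): dst's old entries are freed, src ends empty. */
static void model_sync_move(struct model_sync *src, struct model_sync *dst)
{
	assert(src != dst);
	model_sync_free(dst);
	for (size_t i = 0; i < MODEL_BUCKETS; ++i) {
		dst->buckets[i] = src->buckets[i];
		src->buckets[i] = NULL;
	}
}
```

Because whole bucket heads are handed over, the move in this model needs no allocation, which fits a call made while switching owners on the submission path.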

9 months, 2 weeks


[PATCH AUTOSEL 6.14 059/642] drm/amdgpu: rework how the cleaner shader is emitted v3

by Sasha Levin

From: Christian König <christian.koenig(a)amd.com>

[ Upstream commit b7fbcd77bb467d09ba14cb4ec3b121dc85bb3100 ]

Instead of emitting the cleaner shader for every job which has the
enforce_isolation flag set only emit it for the first submission from
every client.

v2: add missing NULL check
v3: fix another NULL pointer deref

Signed-off-by: Christian König <christian.koenig(a)amd.com>
Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 27 ++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 22aa4a8f11891..f0d675c0fc69c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -754,6 +754,7 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job,
 		    bool need_pipe_sync)
 {
 	struct amdgpu_device *adev = ring->adev;
+	struct amdgpu_isolation *isolation = &adev->isolation[ring->xcp_id];
 	unsigned vmhub = ring->vm_hub;
 	struct amdgpu_vmid_mgr *id_mgr = &adev->vm_manager.id_mgr[vmhub];
 	struct amdgpu_vmid *id = &id_mgr->ids[job->vmid];
@@ -761,8 +762,9 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job,
 	bool gds_switch_needed = ring->funcs->emit_gds_switch &&
 		job->gds_switch_needed;
 	bool vm_flush_needed = job->vm_needs_flush;
-	struct dma_fence *fence = NULL;
+	bool cleaner_shader_needed = false;
 	bool pasid_mapping_needed = false;
+	struct dma_fence *fence = NULL;
 	unsigned int patch;
 	int r;

@@ -785,8 +787,12 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job,
 	pasid_mapping_needed &= adev->gmc.gmc_funcs->emit_pasid_mapping &&
 		ring->funcs->emit_wreg;

+	cleaner_shader_needed = adev->gfx.enable_cleaner_shader &&
+		ring->funcs->emit_cleaner_shader && job->base.s_fence &&
+		&job->base.s_fence->scheduled == isolation->spearhead;
+
 	if (!vm_flush_needed && !gds_switch_needed && !need_pipe_sync &&
-	    !(job->enforce_isolation && !job->vmid))
+	    !cleaner_shader_needed)
 		return 0;

 	amdgpu_ring_ib_begin(ring);
@@ -797,9 +803,7 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job,
 	if (need_pipe_sync)
 		amdgpu_ring_emit_pipeline_sync(ring);

-	if (adev->gfx.enable_cleaner_shader &&
-	    ring->funcs->emit_cleaner_shader &&
-	    job->enforce_isolation)
+	if (cleaner_shader_needed)
 		ring->funcs->emit_cleaner_shader(ring);

 	if (vm_flush_needed) {
@@ -821,7 +825,7 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job,
 					    job->oa_size);
 	}

-	if (vm_flush_needed || pasid_mapping_needed) {
+	if (vm_flush_needed || pasid_mapping_needed || cleaner_shader_needed) {
 		r = amdgpu_fence_emit(ring, &fence, NULL, 0);
 		if (r)
 			return r;
@@ -843,6 +847,17 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job,
 		id->pasid_mapping = dma_fence_get(fence);
 		mutex_unlock(&id_mgr->lock);
 	}
+
+	/*
+	 * Make sure that all other submissions wait for the cleaner shader to
+	 * finish before we push them to the HW.
+	 */
+	if (cleaner_shader_needed) {
+		mutex_lock(&adev->enforce_isolation_mutex);
+		dma_fence_put(isolation->spearhead);
+		isolation->spearhead = dma_fence_get(fence);
+		mutex_unlock(&adev->enforce_isolation_mutex);
+	}
 	dma_fence_put(fence);

 	amdgpu_ring_patch_cond_exec(ring, patch);
-- 
2.39.5
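
The new emission rule in the patch above reduces to a pointer comparison: the cleaner shader is emitted only for the job whose scheduled fence is the partition's current spearhead. The sketch below restates that predicate on hypothetical model_ types; the struct layouts and parameter names are assumptions for illustration, and only the boolean logic is taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a dma_fence; identity is all that matters here. */
struct model_fence {
	int id;
};

/* Hypothetical job: scheduled == NULL models a job without an s_fence. */
struct model_job {
	struct model_fence *scheduled;
};

/*
 * Mirrors the cleaner_shader_needed expression from the patch: feature
 * enabled, ring able to emit, job has a scheduler fence, and that fence
 * is the current spearhead.
 */
static bool model_cleaner_shader_needed(bool enable_cleaner_shader,
					bool ring_can_emit,
					const struct model_job *job,
					const struct model_fence *spearhead)
{
	assert(job != NULL);
	return enable_cleaner_shader && ring_can_emit &&
	       job->scheduled != NULL && job->scheduled == spearhead;
}
```

In this model a second job from the same client compares unequal to the spearhead and so skips the shader, which is the saving the commit message describes.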

9 months, 2 weeks


[PATCH v2 0/6] Replace CONFIG_DMABUF_SYSFS_STATS with BPF

by T.J. Mercier

Until CONFIG_DMABUF_SYSFS_STATS was added [1] it was only possible to
perform per-buffer accounting with debugfs which is not suitable for
production environments. Eventually we discovered the overhead with
per-buffer sysfs file creation/removal was significantly impacting
allocation and free times, and exacerbated kernfs lock contention. [2]
dma_buf_stats_setup() is responsible for 39% of single-page buffer
creation duration, or 74% of single-page dma_buf_export() duration when
stressing dmabuf allocations and frees.

I prototyped a change from per-buffer to per-exporter statistics with a
RCU protected list of exporter allocations that accommodates most (but
not all) of our use-cases and avoids almost all of the sysfs overhead.
While that adds less overhead than per-buffer sysfs, and less even than
the maintenance of the dmabuf debugfs_list, it's still *additional*
overhead on top of the debugfs_list and doesn't give us per-buffer info.

This series uses the existing dmabuf debugfs_list to implement a BPF
dmabuf iterator, which adds no overhead to buffer allocation/free and
provides per-buffer info. The list has been moved outside of
CONFIG_DEBUG_FS scope so that it is always populated. The BPF program
loaded by userspace that extracts per-buffer information gets to define
its own interface which avoids the lack of ABI stability with debugfs.

As this is a replacement for our use of CONFIG_DMABUF_SYSFS_STATS, the
last patch is a RFC for removing it from the kernel. Please see my
suggestion there regarding the timeline for that.

[1] https://lore.kernel.org/linux-media/20201210044400.1080308-1-hridya@google.…
[2] https://lore.kernel.org/all/20220516171315.2400578-1-tjmercier@google.com

v1: https://lore.kernel.org/all/20250414225227.3642618-1-tjmercier@google.com
v1 -> v2:
  Make the DMA buffer list independent of CONFIG_DEBUG_FS per Christian König
  Add CONFIG_DMA_SHARED_BUFFER check to kernel/bpf/Makefile per kernel test robot
  Use BTF_ID_LIST_SINGLE instead of BTF_ID_LIST_GLOBAL_SINGLE per Song Liu
  Fixup comment style, mixing code/declarations, and use ASSERT_OK_FD in selftest per Song Liu
  Add BPF_ITER_RESCHED feature to bpf_dmabuf_reg_info per Alexei Starovoitov
  Add open-coded iterator and selftest per Alexei Starovoitov
  Add a second test buffer from the system dmabuf heap to selftests
  Use the BPF program we'll use in production for selftest per Alexei Starovoitov
    https://r.android.com/c/platform/system/bpfprogs/+/3616123/2/dmabufIter.c
    https://r.android.com/c/platform/system/memory/libmeminfo/+/3614259/1/libdm…

T.J. Mercier (6):
  dma-buf: Rename and expose debugfs symbols
  bpf: Add dmabuf iterator
  bpf: Add open coded dmabuf iterator
  selftests/bpf: Add test for dmabuf_iter
  selftests/bpf: Add test for open coded dmabuf_iter
  RFC: dma-buf: Remove DMA-BUF statistics

 .../ABI/testing/sysfs-kernel-dmabuf-buffers   |  24 --
 Documentation/driver-api/dma-buf.rst          |   5 -
 drivers/dma-buf/Kconfig                       |  15 -
 drivers/dma-buf/Makefile                      |   1 -
 drivers/dma-buf/dma-buf-sysfs-stats.c         | 202 --------------
 drivers/dma-buf/dma-buf-sysfs-stats.h         |  35 ---
 drivers/dma-buf/dma-buf.c                     |  58 +---
 include/linux/dma-buf.h                       |   6 +-
 kernel/bpf/Makefile                           |   3 +
 kernel/bpf/dmabuf_iter.c                      | 177 ++++++++++++
 kernel/bpf/helpers.c                          |   5 +
 .../testing/selftests/bpf/bpf_experimental.h  |   5 +
 tools/testing/selftests/bpf/config            |   3 +
 .../selftests/bpf/prog_tests/dmabuf_iter.c    | 258 ++++++++++++++++++
 .../testing/selftests/bpf/progs/dmabuf_iter.c |  91 ++++++
 15 files changed, 561 insertions(+), 327 deletions(-)
 delete mode 100644 Documentation/ABI/testing/sysfs-kernel-dmabuf-buffers
 delete mode 100644 drivers/dma-buf/dma-buf-sysfs-stats.c
 delete mode 100644 drivers/dma-buf/dma-buf-sysfs-stats.h
 create mode 100644 kernel/bpf/dmabuf_iter.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/dmabuf_iter.c
 create mode 100644 tools/testing/selftests/bpf/progs/dmabuf_iter.c

base-commit: 0af2f6be1b4281385b618cb86ad946eded089ac8
-- 
2.49.0.906.g1f30a19c02-goog

9 months, 2 weeks


[PATCH v4 00/33] drm/msm: sparse / "VM_BIND" support

by Rob Clark

From: Rob Clark <robdclark(a)chromium.org>

Conversion to DRM GPU VA Manager[1], and adding support for Vulkan Sparse
Memory[2] in the form of:

1. A new VM_BIND submitqueue type for executing VM
   MSM_SUBMIT_BO_OP_MAP/MAP_NULL/UNMAP commands
2. A new VM_BIND ioctl to allow submitting batches of one or more
   MAP/MAP_NULL/UNMAP commands to a VM_BIND submitqueue

I did not implement support for synchronous VM_BIND commands. Since
userspace could just immediately wait for the `SUBMIT` to complete, I don't
think we need this extra complexity in the kernel. Synchronous/immediate
VM_BIND operations could be implemented with a 2nd VM_BIND submitqueue.

The corresponding mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533

Changes in v4:
- Replace selftests_running flag with IO_PGTABLE_QUIRK_NO_WARN_ON
  [Robin Murphy]
- Rework msm_gem_vm_sm_step_remap() for cases where orig_vma is evicted,
  to solve some crashes
- Block when drm_file is closed until pending VM_BIND ops complete, before
  tearing down the VM's scheduler, to solve some memory leaks.
- Link to v3: https://lore.kernel.org/all/20250428205619.227835-1-robdclark@gmail.com/

Changes in v3:
- Switched to separate VM_BIND ioctl. This makes the UABI a bit cleaner,
  but OTOH the userspace code was cleaner when the end result of either
  type of VkQueue led to the same ioctl. So I'm a bit on the fence.
- Switched to doing the gpuvm bookkeeping synchronously, and only deferring
  the pgtable updates. This avoids needing to hold any resv locks in the
  fence signaling path, resolving the last shrinker related lockdep
  complaints. OTOH it means userspace can trigger invalid pgtable updates
  with multiple VM_BIND queues. In this case, we ensure that unmaps happen
  completely (to prevent userspace from using this to access freed pages),
  mark the context as unusable, and move on with life.
- Link to v2: https://lore.kernel.org/all/20250319145425.51935-1-robdclark@gmail.com/

Changes in v2:
- Dropped Bibek Kumar Patro's arm-smmu patches[3], which have since been
  merged.
- Pre-allocate all the things, and drop HACK patch which disabled shrinker.
  This includes ensuring that vm_bo objects are allocated up front,
  pre-allocating VMA objects, and pre-allocating pages used for pgtable
  updates. The latter utilizes io_pgtable_cfg callbacks for pgtable
  alloc/free, that were initially added for panthor.
- Add back support for BO dumping for devcoredump.
- Link to v1 (RFC): https://lore.kernel.org/dri-devel/20241207161651.410556-1-robdclark@gmail.c…

[1] https://www.kernel.org/doc/html/next/gpu/drm-mm.html#drm-gpuvm
[2] https://docs.vulkan.org/spec/latest/chapters/sparsemem.html
[3] https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=909700

Rob Clark (33):
  drm/gpuvm: Don't require obj lock in destructor path
  drm/gpuvm: Allow VAs to hold soft reference to BOs
  iommu/io-pgtable-arm: Add quirk to quiet WARN_ON()
  drm/msm: Rename msm_file_private -> msm_context
  drm/msm: Improve msm_context comments
  drm/msm: Rename msm_gem_address_space -> msm_gem_vm
  drm/msm: Remove vram carveout support
  drm/msm: Collapse vma allocation and initialization
  drm/msm: Collapse vma close and delete
  drm/msm: Don't close VMAs on purge
  drm/msm: drm_gpuvm conversion
  drm/msm: Convert vm locking
  drm/msm: Use drm_gpuvm types more
  drm/msm: Split out helper to get iommu prot flags
  drm/msm: Add mmu support for non-zero offset
  drm/msm: Add PRR support
  drm/msm: Rename msm_gem_vma_purge() -> _unmap()
  drm/msm: Lazily create context VM
  drm/msm: Add opt-in for VM_BIND
  drm/msm: Mark VM as unusable on GPU hangs
  drm/msm: Add _NO_SHARE flag
  drm/msm: Crashdump prep for sparse mappings
  drm/msm: rd dumping prep for sparse mappings
  drm/msm: Crashdec support for sparse
  drm/msm: rd dumping support for sparse
  drm/msm: Extract out syncobj helpers
  drm/msm: Use DMA_RESV_USAGE_BOOKKEEP/KERNEL
  drm/msm: Add VM_BIND submitqueue
  drm/msm: Support IO_PGTABLE_QUIRK_NO_WARN_ON
  drm/msm: Support pgtable preallocation
  drm/msm: Split out map/unmap ops
  drm/msm: Add VM_BIND ioctl
  drm/msm: Bump UAPI version

 drivers/gpu/drm/drm_gpuvm.c | 15 +-
 drivers/gpu/drm/msm/Kconfig | 1 +
 drivers/gpu/drm/msm/Makefile | 1 +
 drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 25 +-
 drivers/gpu/drm/msm/adreno/a2xx_gpummu.c | 5 +-
 drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 17 +-
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 17 +-
 drivers/gpu/drm/msm/adreno/a5xx_debugfs.c | 4 +-
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 22 +-
 drivers/gpu/drm/msm/adreno/a5xx_power.c | 2 +-
 drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 10 +-
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 32 +-
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 2 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 49 +-
 drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 6 +-
 drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 10 +-
 drivers/gpu/drm/msm/adreno/adreno_device.c | 4 -
 drivers/gpu/drm/msm/adreno/adreno_gpu.c | 88 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.h | 23 +-
 .../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 14 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 18 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h | 2 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 18 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 14 +-
 drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h | 4 +-
 drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c | 6 +-
 drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c | 28 +-
 drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c | 12 +-
 drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c | 4 +-
 drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 19 +-
 drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c | 12 +-
 drivers/gpu/drm/msm/dsi/dsi_host.c | 14 +-
 drivers/gpu/drm/msm/msm_drv.c | 183 +--
 drivers/gpu/drm/msm/msm_drv.h | 35 +-
 drivers/gpu/drm/msm/msm_fb.c | 18 +-
 drivers/gpu/drm/msm/msm_fbdev.c | 2 +-
 drivers/gpu/drm/msm/msm_gem.c | 489 +++----
 drivers/gpu/drm/msm/msm_gem.h | 217 ++-
 drivers/gpu/drm/msm/msm_gem_prime.c | 15 +
 drivers/gpu/drm/msm/msm_gem_shrinker.c | 4 +-
 drivers/gpu/drm/msm/msm_gem_submit.c | 295 ++--
 drivers/gpu/drm/msm/msm_gem_vma.c | 1288 +++++++++++++++--
 drivers/gpu/drm/msm/msm_gpu.c | 171 ++-
 drivers/gpu/drm/msm/msm_gpu.h | 132 +-
 drivers/gpu/drm/msm/msm_iommu.c | 298 +++-
 drivers/gpu/drm/msm/msm_kms.c | 18 +-
 drivers/gpu/drm/msm/msm_kms.h | 2 +-
 drivers/gpu/drm/msm/msm_mmu.h | 38 +-
 drivers/gpu/drm/msm/msm_rd.c | 62 +-
 drivers/gpu/drm/msm/msm_ringbuffer.c | 4 +-
 drivers/gpu/drm/msm/msm_submitqueue.c | 96 +-
 drivers/gpu/drm/msm/msm_syncobj.c | 172 +++
 drivers/gpu/drm/msm/msm_syncobj.h | 37 +
 drivers/iommu/io-pgtable-arm.c | 27 +-
 include/drm/drm_gpuvm.h | 12 +-
 include/linux/io-pgtable.h | 8 +
 include/uapi/drm/msm_drm.h | 149 +-
 57 files changed, 3043 insertions(+), 1227 deletions(-)
 create mode 100644 drivers/gpu/drm/msm/msm_syncobj.c
 create mode 100644 drivers/gpu/drm/msm/msm_syncobj.h

--
2.49.0

9 months, 2 weeks


Re: [PATCH RESEND] drm/msm: fix a potential memory leak issue in submit_create()

by Rob Clark

On Wed, Apr 23, 2025 at 8:28 PM Haoxiang Li <haoxiang_li2024(a)163.com> wrote:
>
> The memory allocated by msm_fence_alloc() actually is the
> container of msm_fence_alloc()'s return value. Thus, just
> free its return value is not enough.
> Add a helper 'msm_fence_free()' in msm_fence.h/msm_fence.c
> to do the complete job.
>
> Fixes: f94e6a51e17c ("drm/msm: Pre-allocate hw_fence")
> Cc: stable(a)vger.kernel.org
> Signed-off-by: Haoxiang Li <haoxiang_li2024(a)163.com>
> ---
> drivers/gpu/drm/msm/msm_fence.c | 7 +++++++
> drivers/gpu/drm/msm/msm_fence.h | 1 +
> drivers/gpu/drm/msm/msm_gem_submit.c | 2 +-
> 3 files changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_fence.c b/drivers/gpu/drm/msm/msm_fence.c
> index d41e5a6bbee0..72641e6a627d 100644
> --- a/drivers/gpu/drm/msm/msm_fence.c
> +++ b/drivers/gpu/drm/msm/msm_fence.c
> @@ -183,6 +183,13 @@ msm_fence_alloc(void)
> return &f->base;
> }
>
> +void msm_fence_free(struct dma_fence *fence)
> +{
> + struct msm_fence *f = to_msm_fence(fence);
> +
> + kfree(f);
> +}
> +
> void
> msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx)
> {
> diff --git a/drivers/gpu/drm/msm/msm_fence.h b/drivers/gpu/drm/msm/msm_fence.h
> index 148196375a0b..635c68629070 100644
> --- a/drivers/gpu/drm/msm/msm_fence.h
> +++ b/drivers/gpu/drm/msm/msm_fence.h
> @@ -82,6 +82,7 @@ bool msm_fence_completed(struct msm_fence_context *fctx, uint32_t fence);
> void msm_update_fence(struct msm_fence_context *fctx, uint32_t fence);
>
> struct dma_fence * msm_fence_alloc(void);
> +void msm_fence_free(struct dma_fence *fence);
> void msm_fence_init(struct dma_fence *fence, struct msm_fence_context *fctx);
>
> static inline bool
> diff --git a/drivers/gpu/drm/msm/msm_gem_submit.c b/drivers/gpu/drm/msm/msm_gem_submit.c
> index 3e9aa2cc38ef..213baa5bca5e 100644
> --- a/drivers/gpu/drm/msm/msm_gem_submit.c
> +++ b/drivers/gpu/drm/msm/msm_gem_submit.c
> @@ -56,7 +56,7 @@ static struct msm_gem_submit *submit_create(struct drm_device *dev,
>
> ret = drm_sched_job_init(&submit->base, queue->entity, 1, queue);
> if (ret) {
> - kfree(submit->hw_fence);
> + msm_fence_free(submit->hw_fence);

`struct dma_fence base` is the first field in `struct msm_fence`, so
to_msm_fence() is just a pointer cast. I.e. it is fine to pass it to
kfree() as-is.

BR,
-R

> kfree(submit);
> return ERR_PTR(ret);
> }
> --
> 2.25.1
>
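The point made in the reply can be checked with a small userspace sketch. Everything below is a hypothetical stand-in (the demo_* structs, macro and helpers are not the kernel's definitions); it only assumes what the reply states, namely that the embedded base struct is the first member of its container. Under that assumption the container_of()-style conversion subtracts an offset of zero, so the base pointer and the container pointer are the same address and can be freed interchangeably.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structs; only the layout matters. */
struct demo_dma_fence {
	unsigned long flags;
};

struct demo_msm_fence {
	struct demo_dma_fence base;	/* first member, so it sits at offset 0 */
	void *fctx;
};

/* Userspace equivalent of the kernel's container_of() macro. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static struct demo_msm_fence *demo_to_msm_fence(struct demo_dma_fence *fence)
{
	return demo_container_of(fence, struct demo_msm_fence, base);
}

/* Allocate the container, but hand out a pointer to the embedded base. */
static struct demo_dma_fence *demo_fence_alloc(void)
{
	struct demo_msm_fence *f = calloc(1, sizeof(*f));

	return f ? &f->base : NULL;
}

/* Returns 1 when the base pointer and the container pointer coincide. */
static int demo_base_is_container(struct demo_dma_fence *fence)
{
	return (void *)demo_to_msm_fence(fence) == (void *)fence;
}

/* Freeing through the base pointer releases the whole container. */
static void demo_fence_free(struct demo_dma_fence *fence)
{
	if (!fence)
		return;
	assert(demo_base_is_container(fence));
	free(fence);
}
```

If the base member were ever moved away from offset 0, the two pointers would differ and freeing the base pointer directly would no longer be valid; a dedicated free helper would then be needed.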

9 months, 2 weeks


Re: [PATCH 2/3] drm/prime: Support importing DMA-BUF without sg_table

by kernel test robot

Hi,

kernel test robot noticed the following build warnings:

[auto build test WARNING on jic23-iio/togreg]
[also build test WARNING on char-misc/char-misc-testing char-misc/char-misc-next char-misc/char-misc-linus usb/usb-testing usb/usb-next usb/usb-linus xen-tip/linux-next linus/master v6.15-rc4]
[cannot apply to tegra/for-next drm-xe/drm-xe-next rmk-arm/drm-armada-devel rmk-arm/drm-armada-fixes next-20250501]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/oushixiong1025-163-com/drm-p…
base: https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio.git togreg
patch link: https://lore.kernel.org/r/20250430085658.540746-2-oushixiong1025%40163.com
patch subject: [PATCH 2/3] drm/prime: Support importing DMA-BUF without sg_table
config: arc-randconfig-002-20250501 (https://download.01.org/0day-ci/archive/20250502/202505022224.FCDQ8TCB-lkp@…)
compiler: arc-linux-gcc (GCC) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250502/202505022224.FCDQ8TCB-lkp@…)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp(a)intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202505022224.FCDQ8TCB-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/drm_prime.c:925:24: warning: no previous prototype for 'drm_gem_prime_import_dev_skip_map' [-Wmissing-prototypes]
     925 | struct drm_gem_object *drm_gem_prime_import_dev_skip_map(struct drm_device *dev,
         |                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

vim +/drm_gem_prime_import_dev_skip_map +925 drivers/gpu/drm/drm_prime.c

   913
   914	/**
   915	 * drm_gem_prime_import_dev_skip_map - core implementation of the import callback
   916	 * @dev: drm_device to import into
   917	 * @dma_buf: dma-buf object to import
   918	 * @attach_dev: struct device to dma_buf attach
   919	 *
   920	 * This function exports a dma-buf without get it's scatter/gather table.
   921	 *
   922	 * Drivers who need to get an scatter/gather table for objects need to call
   923	 * drm_gem_prime_import_dev() instead.
   924	 */
 > 925	struct drm_gem_object *drm_gem_prime_import_dev_skip_map(struct drm_device *dev,
   926						    struct dma_buf *dma_buf,
   927						    struct device *attach_dev)
   928	{
   929		struct dma_buf_attachment *attach;
   930		struct drm_gem_object *obj;
   931		int ret;
   932
   933		if (dma_buf->ops == &drm_gem_prime_dmabuf_ops) {
   934			obj = dma_buf->priv;
   935			if (obj->dev == dev) {
   936				/*
   937				 * Importing dmabuf exported from our own gem increases
   938				 * refcount on gem itself instead of f_count of dmabuf.
   939				 */
   940				drm_gem_object_get(obj);
   941				return obj;
   942			}
   943		}
   944
   945		attach = dma_buf_attach(dma_buf, attach_dev, true);
   946		if (IS_ERR(attach))
   947			return ERR_CAST(attach);
   948
   949		get_dma_buf(dma_buf);
   950
   951		obj = dev->driver->gem_prime_import_attachment(dev, attach);
   952		if (IS_ERR(obj)) {
   953			ret = PTR_ERR(obj);
   954			goto fail_detach;
   955		}
   956
   957		obj->import_attach = attach;
   958		obj->resv = dma_buf->resv;
   959
   960		return obj;
   961
   962	fail_detach:
   963		dma_buf_detach(dma_buf, attach);
   964		dma_buf_put(dma_buf);
   965
   966		return ERR_PTR(ret);
   967	}
   968	EXPORT_SYMBOL(drm_gem_prime_import_dev_skip_map);
   969

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
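For readers unfamiliar with -Wmissing-prototypes: the warning fires when a non-static function is defined without a prototype having been seen first. A minimal sketch of the usual remedy follows, using hypothetical demo_* names rather than the real DRM symbols. For the reported function the prototype would presumably go into the header that already declares the other PRIME import helpers; that placement is an assumption, not something the report states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a GEM object; not the kernel's struct. */
struct demo_gem_object {
	int refcount;
};

/*
 * Prototype. In a real tree this line belongs in a header that both the
 * defining .c file and its callers include.
 */
struct demo_gem_object *demo_import_skip_map(struct demo_gem_object *obj);

/*
 * Definition. Because a prototype is already visible, a compiler run with
 * -Wmissing-prototypes has nothing to complain about here.
 */
struct demo_gem_object *demo_import_skip_map(struct demo_gem_object *obj)
{
	if (!obj)
		return NULL;

	assert(obj->refcount > 0);
	obj->refcount++;	/* mirrors taking a reference on re-import */
	return obj;
}
```

The alternative remedy is to mark the function static, which only fits when nothing outside the translation unit calls it; an exported symbol needs the prototype instead.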

9 months, 2 weeks


Re: [PATCH 1/3] dma-buf: add flags to skip map_dma_buf() for some drivers

by kernel test robot

Hi,

kernel test robot noticed the following build warnings:

[auto build test WARNING on jic23-iio/togreg]
[also build test WARNING on char-misc/char-misc-testing char-misc/char-misc-next char-misc/char-misc-linus usb/usb-testing usb/usb-next usb/usb-linus xen-tip/linux-next linus/master v6.15-rc4]
[cannot apply to tegra/for-next drm-xe/drm-xe-next rmk-arm/drm-armada-devel rmk-arm/drm-armada-fixes next-20250430]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url: https://github.com/intel-lab-lkp/linux/commits/oushixiong1025-163-com/drm-p…
base: https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio.git togreg
patch link: https://lore.kernel.org/r/20250430085658.540746-1-oushixiong1025%40163.com
patch subject: [PATCH 1/3] dma-buf: add flags to skip map_dma_buf() for some drivers
config: arc-randconfig-002-20250501 (https://download.01.org/0day-ci/archive/20250502/202505020434.7EfUIAjh-lkp@…)
compiler: arc-linux-gcc (GCC) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250502/202505020434.7EfUIAjh-lkp@…)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp(a)intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202505020434.7EfUIAjh-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/dma-buf/dma-buf.c:908: warning: Function parameter or struct member 'skip_map' not described in 'dma_buf_dynamic_attach'
>> drivers/dma-buf/dma-buf.c:996: warning: Function parameter or struct member 'skip_map' not described in 'dma_buf_attach'

vim +908 drivers/dma-buf/dma-buf.c

84335675f2223c drivers/dma-buf/dma-buf.c Simona Vetter 2021-01-15 817
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 818	/**
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 819	 * DOC: locking convention
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 820	 *
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 821	 * In order to avoid deadlock situations between dma-buf exports and importers,
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 822	 * all dma-buf API users must follow the common dma-buf locking convention.
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 823	 *
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 824	 * Convention for importers
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 825	 *
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 826	 * 1. Importers must hold the dma-buf reservation lock when calling these
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 827	 *    functions:
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 828	 *
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 829	 *     - dma_buf_pin()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 830	 *     - dma_buf_unpin()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 831	 *     - dma_buf_map_attachment()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 832	 *     - dma_buf_unmap_attachment()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 833	 *     - dma_buf_vmap()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 834	 *     - dma_buf_vunmap()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 835	 *
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 836	 * 2. Importers must not hold the dma-buf reservation lock when calling these
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 837	 *    functions:
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 838	 *
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 839	 *     - dma_buf_attach()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 840	 *     - dma_buf_dynamic_attach()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 841	 *     - dma_buf_detach()
e3ecbd21776f1f drivers/dma-buf/dma-buf.c Maíra Canal 2023-02-23 842	 *     - dma_buf_export()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 843	 *     - dma_buf_fd()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 844	 *     - dma_buf_get()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 845	 *     - dma_buf_put()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 846	 *     - dma_buf_mmap()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 847	 *     - dma_buf_begin_cpu_access()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 848	 *     - dma_buf_end_cpu_access()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 849	 *     - dma_buf_map_attachment_unlocked()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 850	 *     - dma_buf_unmap_attachment_unlocked()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 851	 *     - dma_buf_vmap_unlocked()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 852	 *     - dma_buf_vunmap_unlocked()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 853	 *
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 854	 * Convention for exporters
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 855	 *
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 856	 * 1. These &dma_buf_ops callbacks are invoked with unlocked dma-buf
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 857	 *    reservation and exporter can take the lock:
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 858	 *
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 859	 *     - &dma_buf_ops.attach()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 860	 *     - &dma_buf_ops.detach()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 861	 *     - &dma_buf_ops.release()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 862	 *     - &dma_buf_ops.begin_cpu_access()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 863	 *     - &dma_buf_ops.end_cpu_access()
8021fa16b7ec0a drivers/dma-buf/dma-buf.c Dmitry Osipenko 2023-05-30 864	 *     - &dma_buf_ops.mmap()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 865	 *
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 866	 * 2. These &dma_buf_ops callbacks are invoked with locked dma-buf
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 867	 *    reservation and exporter can't take the lock:
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 868	 *
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 869	 *     - &dma_buf_ops.pin()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 870	 *     - &dma_buf_ops.unpin()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 871	 *     - &dma_buf_ops.map_dma_buf()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 872	 *     - &dma_buf_ops.unmap_dma_buf()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 873	 *     - &dma_buf_ops.vmap()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 874	 *     - &dma_buf_ops.vunmap()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 875	 *
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 876	 * 3. Exporters must hold the dma-buf reservation lock when calling these
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 877	 *    functions:
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 878	 *
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 879	 *     - dma_buf_move_notify()
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 880	 */
ae2e7f28a170c0 drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 881
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 882	/**
85804b70cca68d drivers/dma-buf/dma-buf.c Simona Vetter 2020-12-11 883	 * dma_buf_dynamic_attach - Add the device to dma_buf's attachments list
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 884	 * @dmabuf: [in] buffer to attach device to.
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 885	 * @dev: [in] device to be attached.
6f49c2515e2258 drivers/dma-buf/dma-buf.c Randy Dunlap 2020-04-07 886	 * @importer_ops: [in] importer operations for the attachment
6f49c2515e2258 drivers/dma-buf/dma-buf.c Randy Dunlap 2020-04-07 887	 * @importer_priv: [in] importer private pointer for the attachment
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 888	 *
2904a8c1311f02 drivers/dma-buf/dma-buf.c Simona Vetter 2016-12-09 889	 * Returns struct dma_buf_attachment pointer for this attachment. Attachments
2904a8c1311f02 drivers/dma-buf/dma-buf.c Simona Vetter 2016-12-09 890	 * must be cleaned up by calling dma_buf_detach().
2904a8c1311f02 drivers/dma-buf/dma-buf.c Simona Vetter 2016-12-09 891	 *
85804b70cca68d drivers/dma-buf/dma-buf.c Simona Vetter 2020-12-11 892	 * Optionally this calls &dma_buf_ops.attach to allow device-specific attach
85804b70cca68d drivers/dma-buf/dma-buf.c Simona Vetter 2020-12-11 893	 * functionality.
85804b70cca68d drivers/dma-buf/dma-buf.c Simona Vetter 2020-12-11 894	 *
2904a8c1311f02 drivers/dma-buf/dma-buf.c Simona Vetter 2016-12-09 895	 * Returns:
2904a8c1311f02 drivers/dma-buf/dma-buf.c Simona Vetter 2016-12-09 896	 *
2904a8c1311f02 drivers/dma-buf/dma-buf.c Simona Vetter 2016-12-09 897	 * A pointer to newly created &dma_buf_attachment on success, or a negative
2904a8c1311f02 drivers/dma-buf/dma-buf.c Simona Vetter 2016-12-09 898	 * error code wrapped into a pointer on failure.
2904a8c1311f02 drivers/dma-buf/dma-buf.c Simona Vetter 2016-12-09 899	 *
2904a8c1311f02 drivers/dma-buf/dma-buf.c Simona Vetter 2016-12-09 900	 * Note that this can fail if the backing storage of @dmabuf is in a place not
2904a8c1311f02 drivers/dma-buf/dma-buf.c Simona Vetter 2016-12-09 901	 * accessible to @dev, and cannot be moved to a more suitable place. This is
2904a8c1311f02 drivers/dma-buf/dma-buf.c Simona Vetter 2016-12-09 902	 * indicated with the error code -EBUSY.
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 903	 */
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 904	struct dma_buf_attachment *
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 905	dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev,
bb42df4662a447 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 906			       const struct dma_buf_attach_ops *importer_ops,
8935ae05eee351 drivers/dma-buf/dma-buf.c Shixiong Ou 2025-04-30 907			       void *importer_priv, bool skip_map)
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 @908	{
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 909		struct dma_buf_attachment *attach;
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 910		int ret;
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 911
d1aa06a1eaf5f7 drivers/base/dma-buf.c Laurent Pinchart 2012-01-26 912		if (WARN_ON(!dmabuf || !dev))
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 913			return ERR_PTR(-EINVAL);
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 914
4981cdb063e3e9 drivers/dma-buf/dma-buf.c Christian König 2020-02-19 915		if (WARN_ON(importer_ops && !importer_ops->move_notify))
4981cdb063e3e9 drivers/dma-buf/dma-buf.c Christian König 2020-02-19 916			return ERR_PTR(-EINVAL);
4981cdb063e3e9 drivers/dma-buf/dma-buf.c Christian König 2020-02-19 917
db7942b6292306 drivers/dma-buf/dma-buf.c Markus Elfring 2017-05-08 918		attach = kzalloc(sizeof(*attach), GFP_KERNEL);
34d84ec4881d13 drivers/dma-buf/dma-buf.c Markus Elfring 2017-05-08 919		if (!attach)
a9fbc3b73127ef drivers/base/dma-buf.c Laurent Pinchart 2012-01-26 920			return ERR_PTR(-ENOMEM);
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 921
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 922		attach->dev = dev;
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 923		attach->dmabuf = dmabuf;
09606b5446c25b drivers/dma-buf/dma-buf.c Christian König 2018-03-22 924		if (importer_ops)
09606b5446c25b drivers/dma-buf/dma-buf.c Christian König 2018-03-22 925			attach->peer2peer = importer_ops->allow_peer2peer;
bb42df4662a447 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 926		attach->importer_ops = importer_ops;
bb42df4662a447 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 927		attach->importer_priv = importer_priv;
2ed9201bdd9a8e drivers/base/dma-buf.c Laurent Pinchart 2012-01-26 928
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 929		if (dmabuf->ops->attach) {
a19741e5e5a9f1 drivers/dma-buf/dma-buf.c Christian König 2018-05-28 930			ret = dmabuf->ops->attach(dmabuf, attach);
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 931			if (ret)
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 932				goto err_attach;
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 933		}
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 934		dma_resv_lock(dmabuf->resv, NULL);
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 935		list_add(&attach->node, &dmabuf->attachments);
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 936		dma_resv_unlock(dmabuf->resv);
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 937
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 938		/* When either the importer or the exporter can't handle dynamic
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 939		 * mappings we cache the mapping here to avoid issues with the
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 940		 * reservation object lock.
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 941		 */
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 942		if (dma_buf_attachment_is_dynamic(attach) !=
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 943		    dma_buf_is_dynamic(dmabuf)) {
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 944			dma_resv_lock(attach->dmabuf->resv, NULL);
809d9c72c2f83e drivers/dma-buf/dma-buf.c Dmitry Osipenko 2022-10-17 945			if (dma_buf_is_dynamic(attach->dmabuf)) {
7e008b02557cce drivers/dma-buf/dma-buf.c Christian König 2021-05-17 946				ret = dmabuf->ops->pin(attach);
bb42df4662a447 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 947				if (ret)
bb42df4662a447 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 948					goto err_unlock;
bb42df4662a447 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 949			}
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 950
8935ae05eee351 drivers/dma-buf/dma-buf.c Shixiong Ou 2025-04-30 951			if (!skip_map) {
8935ae05eee351 drivers/dma-buf/dma-buf.c Shixiong Ou 2025-04-30 952				struct sg_table *sgt;
8935ae05eee351 drivers/dma-buf/dma-buf.c Shixiong Ou 2025-04-30 953
84335675f2223c drivers/dma-buf/dma-buf.c Simona Vetter 2021-01-15 954				sgt = __map_dma_buf(attach, DMA_BIDIRECTIONAL);
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 955				if (!sgt)
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 956					sgt = ERR_PTR(-ENOMEM);
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 957				if (IS_ERR(sgt)) {
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 958					ret = PTR_ERR(sgt);
bb42df4662a447 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 959					goto err_unpin;
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 960				}
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 961				attach->sgt = sgt;
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 962				attach->dir = DMA_BIDIRECTIONAL;
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 963			}
8935ae05eee351 drivers/dma-buf/dma-buf.c Shixiong Ou 2025-04-30 964			dma_resv_unlock(attach->dmabuf->resv);
8935ae05eee351 drivers/dma-buf/dma-buf.c Shixiong Ou 2025-04-30 965		}
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 966
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 967		return attach;
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 968
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 969	err_attach:
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 970		kfree(attach);
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 971		return ERR_PTR(ret);
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 972
bb42df4662a447 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 973	err_unpin:
bb42df4662a447 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 974		if (dma_buf_is_dynamic(attach->dmabuf))
7e008b02557cce drivers/dma-buf/dma-buf.c Christian König 2021-05-17 975			dmabuf->ops->unpin(attach);
bb42df4662a447 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 976
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 977	err_unlock:
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 978		dma_resv_unlock(attach->dmabuf->resv);
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 979
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 980		dma_buf_detach(dmabuf, attach);
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 981		return ERR_PTR(ret);
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 982	}
cdd30ebb1b9f36 drivers/dma-buf/dma-buf.c Peter Zijlstra 2024-12-02 983	EXPORT_SYMBOL_NS_GPL(dma_buf_dynamic_attach, "DMA_BUF");
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 984
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 985	/**
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 986	 * dma_buf_attach - Wrapper for dma_buf_dynamic_attach
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 987	 * @dmabuf: [in] buffer to attach device to.
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 988	 * @dev: [in] device to be attached.
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 989	 *
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 990	 * Wrapper to call dma_buf_dynamic_attach() for drivers which still use a static
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 991	 * mapping.
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 992	 */
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 993	struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
8935ae05eee351 drivers/dma-buf/dma-buf.c Shixiong Ou 2025-04-30 994						  struct device *dev,
8935ae05eee351 drivers/dma-buf/dma-buf.c Shixiong Ou 2025-04-30 995						  bool skip_map)
15fd552d186cb0 drivers/dma-buf/dma-buf.c Christian König 2018-07-03 @996	{
8935ae05eee351 drivers/dma-buf/dma-buf.c Shixiong Ou 2025-04-30 997		return dma_buf_dynamic_attach(dmabuf, dev, NULL, NULL, skip_map);
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 998	}
cdd30ebb1b9f36 drivers/dma-buf/dma-buf.c Peter Zijlstra 2024-12-02 999	EXPORT_SYMBOL_NS_GPL(dma_buf_attach, "DMA_BUF");
d15bd7ee445d07 drivers/base/dma-buf.c Sumit Semwal 2011-12-26 1000

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
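The two warnings above come from the kernel-doc checker: every parameter in a function signature that carries a kernel-doc comment needs a matching `@name:` line in that comment. A minimal sketch of the convention follows, using a hypothetical demo_attach() and struct demo_buf rather than the real dma-buf API; the wording of the `@skip_map` description is illustrative only and is not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical buffer state; not a kernel structure. */
struct demo_buf {
	bool attached;
	bool mapped;
};

/**
 * demo_attach - attach to a buffer, optionally skipping the mapping step
 * @buf: [in] buffer to attach to; must not be NULL.
 * @skip_map: [in] when true, leave the buffer unmapped after attaching.
 *
 * Every parameter in the signature has a matching @name: line above, which
 * is what the kernel-doc check expects.
 *
 * Returns: true if the buffer ends up mapped, false otherwise.
 */
static bool demo_attach(struct demo_buf *buf, bool skip_map)
{
	assert(buf != NULL);
	buf->attached = true;
	buf->mapped = !skip_map;
	return buf->mapped;
}
```

Applied to the report, this would mean adding an `@skip_map:` line to the comments above both dma_buf_dynamic_attach() and dma_buf_attach().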

9 months, 2 weeks


Re: [PATCH 2/3] drm/prime: Support importing DMA-BUF without sg_table

by kernel test robot

Hi,

kernel test robot noticed the following build warnings:

[auto build test WARNING on jic23-iio/togreg]
[also build test WARNING on char-misc/char-misc-testing char-misc/char-misc-next char-misc/char-misc-linus usb/usb-testing usb/usb-next usb/usb-linus xen-tip/linux-next linus/master v6.15-rc4]
[cannot apply to tegra/for-next drm-xe/drm-xe-next rmk-arm/drm-armada-devel rmk-arm/drm-armada-fixes next-20250430]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/oushixiong1025-163-com/drm-p…
base:   https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio.git togreg
patch link:    https://lore.kernel.org/r/20250430085658.540746-2-oushixiong1025%40163.com
patch subject: [PATCH 2/3] drm/prime: Support importing DMA-BUF without sg_table
config: arm64-randconfig-003-20250501 (https://download.01.org/0day-ci/archive/20250501/202505011655.qTmh4UA7-lkp@…)
compiler: clang version 21.0.0git (https://github.com/llvm/llvm-project f819f46284f2a79790038e1f6649172789734ae8)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250501/202505011655.qTmh4UA7-lkp@…)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp(a)intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202505011655.qTmh4UA7-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/gpu/drm/drm_prime.c:925:24: warning: no previous prototype for function 'drm_gem_prime_import_dev_skip_map' [-Wmissing-prototypes]
     925 | struct drm_gem_object *drm_gem_prime_import_dev_skip_map(struct drm_device *dev,
         |                        ^
   drivers/gpu/drm/drm_prime.c:925:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
     925 | struct drm_gem_object *drm_gem_prime_import_dev_skip_map(struct drm_device *dev,
         | ^
         | static
   1 warning generated.

vim +/drm_gem_prime_import_dev_skip_map +925 drivers/gpu/drm/drm_prime.c

   913
   914  /**
   915   * drm_gem_prime_import_dev_skip_map - core implementation of the import callback
   916   * @dev: drm_device to import into
   917   * @dma_buf: dma-buf object to import
   918   * @attach_dev: struct device to dma_buf attach
   919   *
   920   * This function exports a dma-buf without get it's scatter/gather table.
   921   *
   922   * Drivers who need to get an scatter/gather table for objects need to call
   923   * drm_gem_prime_import_dev() instead.
   924   */
 > 925  struct drm_gem_object *drm_gem_prime_import_dev_skip_map(struct drm_device *dev,
   926                                                           struct dma_buf *dma_buf,
   927                                                           struct device *attach_dev)
   928  {
   929          struct dma_buf_attachment *attach;
   930          struct drm_gem_object *obj;
   931          int ret;
   932
   933          if (dma_buf->ops == &drm_gem_prime_dmabuf_ops) {
   934                  obj = dma_buf->priv;
   935                  if (obj->dev == dev) {
   936                          /*
   937                           * Importing dmabuf exported from our own gem increases
   938                           * refcount on gem itself instead of f_count of dmabuf.
   939                           */
   940                          drm_gem_object_get(obj);
   941                          return obj;
   942                  }
   943          }
   944
   945          attach = dma_buf_attach(dma_buf, attach_dev, true);
   946          if (IS_ERR(attach))
   947                  return ERR_CAST(attach);
   948
   949          get_dma_buf(dma_buf);
   950
   951          obj = dev->driver->gem_prime_import_attachment(dev, attach);
   952          if (IS_ERR(obj)) {
   953                  ret = PTR_ERR(obj);
   954                  goto fail_detach;
   955          }
   956
   957          obj->import_attach = attach;
   958          obj->resv = dma_buf->resv;
   959
   960          return obj;
   961
   962  fail_detach:
   963          dma_buf_detach(dma_buf, attach);
   964          dma_buf_put(dma_buf);
   965
   966          return ERR_PTR(ret);
   967  }
   968  EXPORT_SYMBOL(drm_gem_prime_import_dev_skip_map);
   969

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
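The -Wmissing-prototypes warning in this report fires when a non-static function is defined with no earlier declaration in scope. The patch's diffstat does not touch include/drm/drm_prime.h, where drm_gem_prime_import_dev() is declared, so the new function presumably has no prototype anywhere. A minimal userspace sketch of the two usual remedies follows. All demo_* names and types are stand-ins invented for illustration, not kernel API.

```c
#include <stddef.h>

/* Stand-in types; hypothetical, not the kernel's definitions. */
struct demo_device { int id; };
struct demo_buf { size_t size; };
struct demo_object { const struct demo_device *dev; size_t size; };

/*
 * Remedy 1: a prototype that would normally live in a shared header.
 * With this declaration in scope, the definition further down no
 * longer triggers -Wmissing-prototypes.
 */
struct demo_object *demo_import_skip_map(struct demo_object *obj,
					 const struct demo_device *dev,
					 const struct demo_buf *buf);

/*
 * Remedy 2: helpers used only inside this translation unit are marked
 * static, which is what the compiler note suggests.
 */
static size_t demo_page_align(size_t size)
{
	const size_t page = 4096; /* assumed page size for the sketch */

	return (size + page - 1) & ~(page - 1);
}

/* Definition matching the prototype above. */
struct demo_object *demo_import_skip_map(struct demo_object *obj,
					 const struct demo_device *dev,
					 const struct demo_buf *buf)
{
	if (!obj || !dev || !buf)
		return NULL;

	obj->dev = dev;
	obj->size = demo_page_align(buf->size);
	return obj;
}
```

Since the patch exports the symbol with EXPORT_SYMBOL(), the first remedy (a declaration in a header) is the one that would apply there; marking it static would contradict the export.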

9 months, 2 weeks


Re: [PATCH 2/3] drm/prime: Support importing DMA-BUF without sg_table

by Christian König

On 4/30/25 16:13, oushixiong wrote:
>
> On 2025/4/30 19:03, Christian König wrote:
>> On 4/30/25 10:56, oushixiong1025@163.com wrote:
>>> From: Shixiong Ou <oushixiong(a)kylinos.cn>
>>>
>>> [WHY]
>>> On some boards, the dma_mask of Aspeed devices is 0xffff_ffff, this
>>> quite possibly causes the SWIOTLB to be triggered when importing dmabuf.
>>> However IO_TLB_SEGSIZE limits the maximum amount of available memory
>>> for DMA Streaming Mapping, as dmesg following:
>>>
>>> [ 24.885303][ T1947] ast 0000:07:00.0: swiotlb buffer is full (sz: 3145728 bytes), total 32768 (slots), used 0 (slots)
>>>
>>> [HOW] Provide an interface so that attachment is not mapped when
>>> importing dma-buf.
>> This is unnecessary. The extra abstraction in DRM is only useful when you want to implement the obj->funcs->get_sg_table() callback.
>>
>> When a driver doesn't want to expose an sg_table for a buffer or wants some other special handling it can simply do so by implementing the DMA-buf interface directly.
>>
>> See drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c for an example on how to do this.
>>
>> Regards,
>> Christian.
>
>
> Thanks for the reminder,
>
> most drivers that use DRM_GEM_SHADOW_PLANE_HELPER_FUNCS and DRM_GEM_SHMEM_DRIVER_OPS
>
> don't need to import the sg_table, such as the udl and the ast and so on at the moment.
>
> They just need to call dma_buf_vmap() to get the kernel virtual address of the shared buffer.
>
> So I wondered if there was a simple generic PRIME implementation for these drivers.
>
> If you don't recommend this, maybe try to implement it in DRM_GEM_SHMEM_DRIVER_OPS?

Well if you only want to implement vmap/vunmap the necessary code in the driver would look something like this:

const struct dma_buf_ops amdgpu_dmabuf_ops = {
	.map_dma_buf = dummy_map_function,
	.release = drm_gem_dmabuf_release,
	.mmap = drm_gem_dmabuf_mmap,
	.vmap = drm_gem_dmabuf_vmap,
	.vunmap = drm_gem_dmabuf_vunmap,
};

struct dma_buf *drv_gem_prime_export(struct drm_gem_object *gobj, int flags)
{
	struct dma_buf *buf;

	buf = drm_gem_prime_export(gobj, flags);
	if (!IS_ERR(buf))
		buf->ops = &amdgpu_dmabuf_ops;

	return buf;
}

The only thing which could be improved is the dummy_map_function. As far as I can see we could make the map function optional in DMA-buf now.

Apart from that you could make a DRM helper from those few lines, but to be honest I don't think it's worth it. It reduces the loc a bit, but there is no real complexity here which drivers could share.

Regards,
Christian.

>
> Regards,
>
> Shixiong Ou.
>
>>> Signed-off-by: Shixiong Ou <oushixiong(a)kylinos.cn>
>>> ---
>>>  drivers/gpu/drm/ast/ast_drv.c          |  2 +-
>>>  drivers/gpu/drm/drm_gem_shmem_helper.c | 17 +++++++
>>>  drivers/gpu/drm/drm_prime.c            | 67 ++++++++++++++++++++++++--
>>>  drivers/gpu/drm/udl/udl_drv.c          |  2 +-
>>>  include/drm/drm_drv.h                  |  3 ++
>>>  include/drm/drm_gem_shmem_helper.h     |  6 +++
>>>  6 files changed, 91 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/ast/ast_drv.c b/drivers/gpu/drm/ast/ast_drv.c
>>> index 6fbf62a99c48..2dac6acf79e7 100644
>>> --- a/drivers/gpu/drm/ast/ast_drv.c
>>> +++ b/drivers/gpu/drm/ast/ast_drv.c
>>> @@ -64,7 +64,7 @@ static const struct drm_driver ast_driver = {
>>>  	.minor = DRIVER_MINOR,
>>>  	.patchlevel = DRIVER_PATCHLEVEL,
>>> -	DRM_GEM_SHMEM_DRIVER_OPS,
>>> +	DRM_GEM_SHMEM_SIMPLE_DRIVER_OPS,
>>>  	DRM_FBDEV_SHMEM_DRIVER_OPS,
>>>  };
>>> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c >>> index d99dee67353a..655d841df933 100644 >>> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c >>> +++
b/drivers/gpu/drm/drm_gem_shmem_helper.c >>> @@ -799,6 +799,23 @@ drm_gem_shmem_prime_import_sg_table(struct drm_device *dev, >>> } >>> EXPORT_SYMBOL_GPL(drm_gem_shmem_prime_import_sg_table); >>> +struct drm_gem_object * >>> +drm_gem_shmem_prime_import_attachment(struct drm_device *dev, >>> + struct dma_buf_attachment *attach) >>> +{ >>> + size_t size = PAGE_ALIGN(attach->dmabuf->size); >>> + struct drm_gem_shmem_object *shmem; >>> + >>> + shmem = __drm_gem_shmem_create(dev, size, true, NULL); >>> + if (IS_ERR(shmem)) >>> + return ERR_CAST(shmem); >>> + >>> + drm_dbg_prime(dev, "size = %zu\n", size); >>> + >>> + return &shmem->base; >>> +} >>> +EXPORT_SYMBOL_GPL(drm_gem_shmem_prime_import_attachment); >>> + >>> MODULE_DESCRIPTION("DRM SHMEM memory-management helpers"); >>> MODULE_IMPORT_NS("DMA_BUF"); >>> MODULE_LICENSE("GPL v2"); >>> diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c >>> index 8e70abca33b9..522cf974e202 100644 >>> --- a/drivers/gpu/drm/drm_prime.c >>> +++ b/drivers/gpu/drm/drm_prime.c >>> @@ -911,6 +911,62 @@ struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj, >>> } >>> EXPORT_SYMBOL(drm_gem_prime_export); >>> +/** >>> + * drm_gem_prime_import_dev_skip_map - core implementation of the import callback >>> + * @dev: drm_device to import into >>> + * @dma_buf: dma-buf object to import >>> + * @attach_dev: struct device to dma_buf attach >>> + * >>> + * This function exports a dma-buf without get it's scatter/gather table. >>> + * >>> + * Drivers who need to get an scatter/gather table for objects need to call >>> + * drm_gem_prime_import_dev() instead. 
>>> + */ >>> +struct drm_gem_object *drm_gem_prime_import_dev_skip_map(struct drm_device *dev, >>> + struct dma_buf *dma_buf, >>> + struct device *attach_dev) >>> +{ >>> + struct dma_buf_attachment *attach; >>> + struct drm_gem_object *obj; >>> + int ret; >>> + >>> + if (dma_buf->ops == &drm_gem_prime_dmabuf_ops) { >>> + obj = dma_buf->priv; >>> + if (obj->dev == dev) { >>> + /* >>> + * Importing dmabuf exported from our own gem increases >>> + * refcount on gem itself instead of f_count of dmabuf. >>> + */ >>> + drm_gem_object_get(obj); >>> + return obj; >>> + } >>> + } >>> + >>> + attach = dma_buf_attach(dma_buf, attach_dev, true); >>> + if (IS_ERR(attach)) >>> + return ERR_CAST(attach); >>> + >>> + get_dma_buf(dma_buf); >>> + >>> + obj = dev->driver->gem_prime_import_attachment(dev, attach); >>> + if (IS_ERR(obj)) { >>> + ret = PTR_ERR(obj); >>> + goto fail_detach; >>> + } >>> + >>> + obj->import_attach = attach; >>> + obj->resv = dma_buf->resv; >>> + >>> + return obj; >>> + >>> +fail_detach: >>> + dma_buf_detach(dma_buf, attach); >>> + dma_buf_put(dma_buf); >>> + >>> + return ERR_PTR(ret); >>> +} >>> +EXPORT_SYMBOL(drm_gem_prime_import_dev_skip_map); >>> + >>> /** >>> * drm_gem_prime_import_dev - core implementation of the import callback >>> * @dev: drm_device to import into >>> @@ -946,9 +1002,6 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, >>> } >>> } >>> - if (!dev->driver->gem_prime_import_sg_table) >>> - return ERR_PTR(-EINVAL); >>> - >>> attach = dma_buf_attach(dma_buf, attach_dev, false); >>> if (IS_ERR(attach)) >>> return ERR_CAST(attach); >>> @@ -998,7 +1051,13 @@ EXPORT_SYMBOL(drm_gem_prime_import_dev); >>> struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev, >>> struct dma_buf *dma_buf) >>> { >>> - return drm_gem_prime_import_dev(dev, dma_buf, dev->dev); >>> + if (dev->driver->gem_prime_import_sg_table) >>> + return drm_gem_prime_import_dev(dev, dma_buf, dev->dev); >>> + else if 
(dev->driver->gem_prime_import_attachment) >>> + return drm_gem_prime_import_dev_skip_map(dev, dma_buf, dev->dev); >>> + else >>> + return ERR_PTR(-EINVAL); >>> + >>> } >>> EXPORT_SYMBOL(drm_gem_prime_import); >>> diff --git a/drivers/gpu/drm/udl/udl_drv.c b/drivers/gpu/drm/udl/udl_drv.c >>> index 05b3a152cc33..c00d8b8834f2 100644 >>> --- a/drivers/gpu/drm/udl/udl_drv.c >>> +++ b/drivers/gpu/drm/udl/udl_drv.c >>> @@ -72,7 +72,7 @@ static const struct drm_driver driver = { >>> /* GEM hooks */ >>> .fops = &udl_driver_fops, >>> - DRM_GEM_SHMEM_DRIVER_OPS, >>> + DRM_GEM_SHMEM_SIMPLE_DRIVER_OPS, >>> .gem_prime_import = udl_driver_gem_prime_import, >>> DRM_FBDEV_SHMEM_DRIVER_OPS, >>> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h >>> index a43d707b5f36..aef8d9051fcd 100644 >>> --- a/include/drm/drm_drv.h >>> +++ b/include/drm/drm_drv.h >>> @@ -326,6 +326,9 @@ struct drm_driver { >>> struct dma_buf_attachment *attach, >>> struct sg_table *sgt); >>> + struct drm_gem_object *(*gem_prime_import_attachment)( >>> + struct drm_device *dev, >>> + struct dma_buf_attachment *attach); >>> /** >>> * @dumb_create: >>> * >>> diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h >>> index cef5a6b5a4d6..39a93c222aaa 100644 >>> --- a/include/drm/drm_gem_shmem_helper.h >>> +++ b/include/drm/drm_gem_shmem_helper.h >>> @@ -274,6 +274,9 @@ struct drm_gem_object * >>> drm_gem_shmem_prime_import_sg_table(struct drm_device *dev, >>> struct dma_buf_attachment *attach, >>> struct sg_table *sgt); >>> +struct drm_gem_object * >>> +drm_gem_shmem_prime_import_attachment(struct drm_device *dev, >>> + struct dma_buf_attachment *attach); >>> int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev, >>> struct drm_mode_create_dumb *args); >>> @@ -287,4 +290,7 @@ int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev, >>> .gem_prime_import_sg_table = drm_gem_shmem_prime_import_sg_table, \ >>> .dumb_create = 
drm_gem_shmem_dumb_create >>> +#define DRM_GEM_SHMEM_SIMPLE_DRIVER_OPS \ >>> + .gem_prime_import_attachment = drm_gem_shmem_prime_import_attachment, \ >>> + .dumb_create = drm_gem_shmem_dumb_create >>> #endif /* __DRM_GEM_SHMEM_HELPER_H__ */
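The approach Christian outlines in this reply (call the generic export helper, then point the exported buffer at a driver-specific ops table that only really implements vmap/vunmap) can be sketched in plain userspace C. This is only an illustration of the function-pointer-table pattern under stated assumptions: every demo_* name and type is invented here, and the stub map callback merely stands in for the dummy_map_function he mentions.

```c
#include <stddef.h>

struct demo_buf;

/* Stand-in for struct dma_buf_ops: a table of function pointers. */
struct demo_buf_ops {
	int (*map)(struct demo_buf *buf);    /* stand-in for .map_dma_buf */
	void *(*vmap)(struct demo_buf *buf); /* stand-in for .vmap */
};

/* Stand-in for struct dma_buf. */
struct demo_buf {
	const struct demo_buf_ops *ops;
	char backing[64];
};

/* Full-featured map used by the generic ops table. */
static int demo_map_full(struct demo_buf *buf)
{
	(void)buf;
	return 0;
}

/* Stub map that always refuses; plays the role of dummy_map_function. */
static int demo_map_dummy(struct demo_buf *buf)
{
	(void)buf;
	return -1; /* stand-in error code */
}

static void *demo_vmap(struct demo_buf *buf)
{
	return buf->backing;
}

static const struct demo_buf_ops demo_default_ops = {
	.map = demo_map_full,
	.vmap = demo_vmap,
};

static const struct demo_buf_ops demo_vmap_only_ops = {
	.map = demo_map_dummy,
	.vmap = demo_vmap,
};

/* Stand-in for the generic export helper: installs the default ops. */
struct demo_buf *demo_generic_export(struct demo_buf *buf)
{
	if (!buf)
		return NULL;
	buf->ops = &demo_default_ops;
	return buf;
}

/* Driver export hook: reuse the generic helper, then swap the ops table. */
struct demo_buf *demo_driver_export(struct demo_buf *buf)
{
	struct demo_buf *out = demo_generic_export(buf);

	if (out)
		out->ops = &demo_vmap_only_ops;
	return out;
}
```

The design point is that the driver shares all generic export bookkeeping and only changes which operations later callers can reach through the buffer, which is why Christian considers a dedicated DRM helper to be of limited value.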

9 months, 2 weeks


Re: [PATCH 2/3] drm/prime: Support importing DMA-BUF without sg_table

by Christian König

On 4/30/25 10:56, oushixiong1025(a)163.com wrote:
> From: Shixiong Ou <oushixiong(a)kylinos.cn>
>
> [WHY]
> On some boards, the dma_mask of Aspeed devices is 0xffff_ffff, this
> quite possibly causes the SWIOTLB to be triggered when importing dmabuf.
> However IO_TLB_SEGSIZE limits the maximum amount of available memory
> for DMA Streaming Mapping, as dmesg following:
>
> [ 24.885303][ T1947] ast 0000:07:00.0: swiotlb buffer is full (sz: 3145728 bytes), total 32768 (slots), used 0 (slots)
>
> [HOW] Provide an interface so that attachment is not mapped when
> importing dma-buf.

This is unnecessary. The extra abstraction in DRM is only useful when you want to implement the obj->funcs->get_sg_table() callback.

When a driver doesn't want to expose an sg_table for a buffer or wants some other special handling it can simply do so by implementing the DMA-buf interface directly.

See drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c for an example on how to do this.

Regards,
Christian.

>
> Signed-off-by: Shixiong Ou <oushixiong(a)kylinos.cn>
> ---
>  drivers/gpu/drm/ast/ast_drv.c          |  2 +-
>  drivers/gpu/drm/drm_gem_shmem_helper.c | 17 +++++++
>  drivers/gpu/drm/drm_prime.c            | 67 ++++++++++++++++++++++++--
>  drivers/gpu/drm/udl/udl_drv.c          |  2 +-
>  include/drm/drm_drv.h                  |  3 ++
>  include/drm/drm_gem_shmem_helper.h     |  6 +++
>  6 files changed, 91 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/ast/ast_drv.c b/drivers/gpu/drm/ast/ast_drv.c
> index 6fbf62a99c48..2dac6acf79e7 100644
> --- a/drivers/gpu/drm/ast/ast_drv.c
> +++ b/drivers/gpu/drm/ast/ast_drv.c
> @@ -64,7 +64,7 @@ static const struct drm_driver ast_driver = {
>  	.minor = DRIVER_MINOR,
>  	.patchlevel = DRIVER_PATCHLEVEL,
>
> -	DRM_GEM_SHMEM_DRIVER_OPS,
> +	DRM_GEM_SHMEM_SIMPLE_DRIVER_OPS,
>  	DRM_FBDEV_SHMEM_DRIVER_OPS,
>  };
>
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c > index d99dee67353a..655d841df933 100644 > --- a/drivers/gpu/drm/drm_gem_shmem_helper.c > +++
b/drivers/gpu/drm/drm_gem_shmem_helper.c > @@ -799,6 +799,23 @@ drm_gem_shmem_prime_import_sg_table(struct drm_device *dev, > } > EXPORT_SYMBOL_GPL(drm_gem_shmem_prime_import_sg_table); > > +struct drm_gem_object * > +drm_gem_shmem_prime_import_attachment(struct drm_device *dev, > + struct dma_buf_attachment *attach) > +{ > + size_t size = PAGE_ALIGN(attach->dmabuf->size); > + struct drm_gem_shmem_object *shmem; > + > + shmem = __drm_gem_shmem_create(dev, size, true, NULL); > + if (IS_ERR(shmem)) > + return ERR_CAST(shmem); > + > + drm_dbg_prime(dev, "size = %zu\n", size); > + > + return &shmem->base; > +} > +EXPORT_SYMBOL_GPL(drm_gem_shmem_prime_import_attachment); > + > MODULE_DESCRIPTION("DRM SHMEM memory-management helpers"); > MODULE_IMPORT_NS("DMA_BUF"); > MODULE_LICENSE("GPL v2"); > diff --git a/drivers/gpu/drm/drm_prime.c b/drivers/gpu/drm/drm_prime.c > index 8e70abca33b9..522cf974e202 100644 > --- a/drivers/gpu/drm/drm_prime.c > +++ b/drivers/gpu/drm/drm_prime.c > @@ -911,6 +911,62 @@ struct dma_buf *drm_gem_prime_export(struct drm_gem_object *obj, > } > EXPORT_SYMBOL(drm_gem_prime_export); > > +/** > + * drm_gem_prime_import_dev_skip_map - core implementation of the import callback > + * @dev: drm_device to import into > + * @dma_buf: dma-buf object to import > + * @attach_dev: struct device to dma_buf attach > + * > + * This function exports a dma-buf without get it's scatter/gather table. > + * > + * Drivers who need to get an scatter/gather table for objects need to call > + * drm_gem_prime_import_dev() instead. 
> + */ > +struct drm_gem_object *drm_gem_prime_import_dev_skip_map(struct drm_device *dev, > + struct dma_buf *dma_buf, > + struct device *attach_dev) > +{ > + struct dma_buf_attachment *attach; > + struct drm_gem_object *obj; > + int ret; > + > + if (dma_buf->ops == &drm_gem_prime_dmabuf_ops) { > + obj = dma_buf->priv; > + if (obj->dev == dev) { > + /* > + * Importing dmabuf exported from our own gem increases > + * refcount on gem itself instead of f_count of dmabuf. > + */ > + drm_gem_object_get(obj); > + return obj; > + } > + } > + > + attach = dma_buf_attach(dma_buf, attach_dev, true); > + if (IS_ERR(attach)) > + return ERR_CAST(attach); > + > + get_dma_buf(dma_buf); > + > + obj = dev->driver->gem_prime_import_attachment(dev, attach); > + if (IS_ERR(obj)) { > + ret = PTR_ERR(obj); > + goto fail_detach; > + } > + > + obj->import_attach = attach; > + obj->resv = dma_buf->resv; > + > + return obj; > + > +fail_detach: > + dma_buf_detach(dma_buf, attach); > + dma_buf_put(dma_buf); > + > + return ERR_PTR(ret); > +} > +EXPORT_SYMBOL(drm_gem_prime_import_dev_skip_map); > + > /** > * drm_gem_prime_import_dev - core implementation of the import callback > * @dev: drm_device to import into > @@ -946,9 +1002,6 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, > } > } > > - if (!dev->driver->gem_prime_import_sg_table) > - return ERR_PTR(-EINVAL); > - > attach = dma_buf_attach(dma_buf, attach_dev, false); > if (IS_ERR(attach)) > return ERR_CAST(attach); > @@ -998,7 +1051,13 @@ EXPORT_SYMBOL(drm_gem_prime_import_dev); > struct drm_gem_object *drm_gem_prime_import(struct drm_device *dev, > struct dma_buf *dma_buf) > { > - return drm_gem_prime_import_dev(dev, dma_buf, dev->dev); > + if (dev->driver->gem_prime_import_sg_table) > + return drm_gem_prime_import_dev(dev, dma_buf, dev->dev); > + else if (dev->driver->gem_prime_import_attachment) > + return drm_gem_prime_import_dev_skip_map(dev, dma_buf, dev->dev); > + else > + return ERR_PTR(-EINVAL); > + > 
} > EXPORT_SYMBOL(drm_gem_prime_import); > > diff --git a/drivers/gpu/drm/udl/udl_drv.c b/drivers/gpu/drm/udl/udl_drv.c > index 05b3a152cc33..c00d8b8834f2 100644 > --- a/drivers/gpu/drm/udl/udl_drv.c > +++ b/drivers/gpu/drm/udl/udl_drv.c > @@ -72,7 +72,7 @@ static const struct drm_driver driver = { > > /* GEM hooks */ > .fops = &udl_driver_fops, > - DRM_GEM_SHMEM_DRIVER_OPS, > + DRM_GEM_SHMEM_SIMPLE_DRIVER_OPS, > .gem_prime_import = udl_driver_gem_prime_import, > DRM_FBDEV_SHMEM_DRIVER_OPS, > > diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h > index a43d707b5f36..aef8d9051fcd 100644 > --- a/include/drm/drm_drv.h > +++ b/include/drm/drm_drv.h > @@ -326,6 +326,9 @@ struct drm_driver { > struct dma_buf_attachment *attach, > struct sg_table *sgt); > > + struct drm_gem_object *(*gem_prime_import_attachment)( > + struct drm_device *dev, > + struct dma_buf_attachment *attach); > /** > * @dumb_create: > * > diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h > index cef5a6b5a4d6..39a93c222aaa 100644 > --- a/include/drm/drm_gem_shmem_helper.h > +++ b/include/drm/drm_gem_shmem_helper.h > @@ -274,6 +274,9 @@ struct drm_gem_object * > drm_gem_shmem_prime_import_sg_table(struct drm_device *dev, > struct dma_buf_attachment *attach, > struct sg_table *sgt); > +struct drm_gem_object * > +drm_gem_shmem_prime_import_attachment(struct drm_device *dev, > + struct dma_buf_attachment *attach); > int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev, > struct drm_mode_create_dumb *args); > > @@ -287,4 +290,7 @@ int drm_gem_shmem_dumb_create(struct drm_file *file, struct drm_device *dev, > .gem_prime_import_sg_table = drm_gem_shmem_prime_import_sg_table, \ > .dumb_create = drm_gem_shmem_dumb_create > > +#define DRM_GEM_SHMEM_SIMPLE_DRIVER_OPS \ > + .gem_prime_import_attachment = drm_gem_shmem_prime_import_attachment, \ > + .dumb_create = drm_gem_shmem_dumb_create > #endif /* __DRM_GEM_SHMEM_HELPER_H__ */
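The [WHY] section of the patch quotes a "swiotlb buffer is full" message even though the pool reports 0 used slots. The arithmetic behind that can be checked with a short sketch. It assumes the constants found in include/linux/swiotlb.h at the time of writing (IO_TLB_SHIFT of 11, so 2 KiB slots, and IO_TLB_SEGSIZE of 128 slots); the DEMO_* macros and demo_* helpers are names made up for this example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed to mirror IO_TLB_SHIFT and IO_TLB_SEGSIZE; verify against your tree. */
#define DEMO_IO_TLB_SHIFT   11
#define DEMO_IO_TLB_SEGSIZE 128

/* Size of one bounce-buffer slot in bytes (2048 with the values above). */
static size_t demo_slot_size(void)
{
	return (size_t)1 << DEMO_IO_TLB_SHIFT;
}

/* Number of slots a mapping of the given size occupies, rounded up. */
size_t demo_slots_needed(size_t bytes)
{
	return (bytes + demo_slot_size() - 1) / demo_slot_size();
}

/* Largest single mapping that fits into one contiguous segment. */
size_t demo_max_mapping_bytes(void)
{
	return DEMO_IO_TLB_SEGSIZE * demo_slot_size();
}

/* True if one mapping of this size can be bounced at all. */
bool demo_fits_in_segment(size_t bytes)
{
	return demo_slots_needed(bytes) <= DEMO_IO_TLB_SEGSIZE;
}
```

Under these assumptions a single mapping is capped at 128 * 2 KiB = 256 KiB, while the failing request of 3145728 bytes needs 1536 slots. The 32768 total slots from the log correspond to a 64 MiB pool, so the failure would come from the per-mapping segment limit and not from the pool being exhausted.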

9 months, 2 weeks


Re: [PATCH 1/3] dma-buf: add flags to skip map_dma_buf() for some drivers

by Christian König

On 4/30/25 10:56, oushixiong1025(a)163.com wrote:
> From: Shixiong Ou <oushixiong(a)kylinos.cn>
>
> [WHY] Some Importer does not need to call dma_buf_map_attachment() to
> get the scatterlist info, especially those drivers of hardware that do
> not support DMA, such as the udl, the virtgpu and the ast.
>
> [HOW] skip map_dma_buf() when dma_buf_dynamic_attach() for some drivers.

This patch is based on outdated code. Please see drm-misc-next where the mapping during attach was already dropped.

commit b72f66f22c0e39ae6684c43fead774c13db24e73
Author: Christian König <christian.koenig(a)amd.com>
Date:   Tue Feb 11 17:20:53 2025 +0100

    dma-buf: drop caching of sg_tables

    That was purely for the transition from static to dynamic dma-buf
    handling and can be removed again now.

Regards,
Christian.

> Signed-off-by: Shixiong Ou <oushixiong(a)kylinos.cn>
> ---
>  drivers/accel/ivpu/ivpu_gem.c                 |  2 +-
>  drivers/accel/qaic/qaic_data.c                |  2 +-
>  drivers/dma-buf/dma-buf.c                     | 29 ++++++++++---------
>  drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c   |  2 +-
>  drivers/gpu/drm/armada/armada_gem.c           |  2 +-
>  drivers/gpu/drm/drm_prime.c                   |  2 +-
>  drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c    |  2 +-
>  .../drm/i915/gem/selftests/i915_gem_dmabuf.c  |  2 +-
>  drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c     |  2 +-
>  drivers/gpu/drm/tegra/gem.c                   |  4 +--
>  drivers/gpu/drm/virtio/virtgpu_prime.c        |  2 +-
>  drivers/gpu/drm/xe/xe_dma_buf.c               |  2 +-
>  drivers/iio/industrialio-buffer.c             |  2 +-
>  drivers/infiniband/core/umem_dmabuf.c         |  3 +-
>  .../common/videobuf2/videobuf2-dma-contig.c   |  2 +-
>  .../media/common/videobuf2/videobuf2-dma-sg.c |  2 +-
>  .../platform/nvidia/tegra-vde/dmabuf-cache.c  |  2 +-
>  drivers/misc/fastrpc.c                        |  2 +-
>  drivers/usb/gadget/function/f_fs.c            |  2 +-
>  drivers/xen/gntdev-dmabuf.c                   |  2 +-
>  include/linux/dma-buf.h                       |  5 ++--
>  net/core/devmem.c                             |  2 +-
>  22 files changed, 41 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/accel/ivpu/ivpu_gem.c b/drivers/accel/ivpu/ivpu_gem.c > index
8741c73b92ce..5258a66ed945 100644 > --- a/drivers/accel/ivpu/ivpu_gem.c > +++ b/drivers/accel/ivpu/ivpu_gem.c > @@ -183,7 +183,7 @@ struct drm_gem_object *ivpu_gem_prime_import(struct drm_device *dev, > struct drm_gem_object *obj; > int ret; > > - attach = dma_buf_attach(dma_buf, attach_dev); > + attach = dma_buf_attach(dma_buf, attach_dev, false); > if (IS_ERR(attach)) > return ERR_CAST(attach); > > diff --git a/drivers/accel/qaic/qaic_data.c b/drivers/accel/qaic/qaic_data.c > index 43aba57b48f0..c13c64d59143 100644 > --- a/drivers/accel/qaic/qaic_data.c > +++ b/drivers/accel/qaic/qaic_data.c > @@ -803,7 +803,7 @@ struct drm_gem_object *qaic_gem_prime_import(struct drm_device *dev, struct dma_ > obj = &bo->base; > get_dma_buf(dma_buf); > > - attach = dma_buf_attach(dma_buf, dev->dev); > + attach = dma_buf_attach(dma_buf, dev->dev, false); > if (IS_ERR(attach)) { > ret = PTR_ERR(attach); > goto attach_fail; > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c > index 5baa83b85515..dd7fe5fbf197 100644 > --- a/drivers/dma-buf/dma-buf.c > +++ b/drivers/dma-buf/dma-buf.c > @@ -904,7 +904,7 @@ static struct sg_table *__map_dma_buf(struct dma_buf_attachment *attach, > struct dma_buf_attachment * > dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, > const struct dma_buf_attach_ops *importer_ops, > - void *importer_priv) > + void *importer_priv, bool skip_map) > { > struct dma_buf_attachment *attach; > int ret; > @@ -941,8 +941,6 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, > */ > if (dma_buf_attachment_is_dynamic(attach) != > dma_buf_is_dynamic(dmabuf)) { > - struct sg_table *sgt; > - > dma_resv_lock(attach->dmabuf->resv, NULL); > if (dma_buf_is_dynamic(attach->dmabuf)) { > ret = dmabuf->ops->pin(attach); > @@ -950,16 +948,20 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, > goto err_unlock; > } > > - sgt = __map_dma_buf(attach, DMA_BIDIRECTIONAL); > - if (!sgt) > - sgt = ERR_PTR(-ENOMEM); > - 
if (IS_ERR(sgt)) { > - ret = PTR_ERR(sgt); > - goto err_unpin; > + if (!skip_map) { > + struct sg_table *sgt; > + > + sgt = __map_dma_buf(attach, DMA_BIDIRECTIONAL); > + if (!sgt) > + sgt = ERR_PTR(-ENOMEM); > + if (IS_ERR(sgt)) { > + ret = PTR_ERR(sgt); > + goto err_unpin; > + } > + attach->sgt = sgt; > + attach->dir = DMA_BIDIRECTIONAL; > } > dma_resv_unlock(attach->dmabuf->resv); > - attach->sgt = sgt; > - attach->dir = DMA_BIDIRECTIONAL; > } > > return attach; > @@ -989,9 +991,10 @@ EXPORT_SYMBOL_NS_GPL(dma_buf_dynamic_attach, "DMA_BUF"); > * mapping. > */ > struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, > - struct device *dev) > + struct device *dev, > + bool skip_map) > { > - return dma_buf_dynamic_attach(dmabuf, dev, NULL, NULL); > + return dma_buf_dynamic_attach(dmabuf, dev, NULL, NULL, skip_map); > } > EXPORT_SYMBOL_NS_GPL(dma_buf_attach, "DMA_BUF"); > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c > index e6913fcf2c7b..26c94834e6d2 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c > @@ -479,7 +479,7 @@ struct drm_gem_object *amdgpu_gem_prime_import(struct drm_device *dev, > return obj; > > attach = dma_buf_dynamic_attach(dma_buf, dev->dev, > - &amdgpu_dma_buf_attach_ops, obj); > + &amdgpu_dma_buf_attach_ops, obj, false); > if (IS_ERR(attach)) { > drm_gem_object_put(obj); > return ERR_CAST(attach); > diff --git a/drivers/gpu/drm/armada/armada_gem.c b/drivers/gpu/drm/armada/armada_gem.c > index 1a1680d71486..7e1a82828b87 100644 > --- a/drivers/gpu/drm/armada/armada_gem.c > +++ b/drivers/gpu/drm/armada/armada_gem.c > @@ -514,7 +514,7 @@ armada_gem_prime_import(struct drm_device *dev, struct dma_buf *buf) > } > } > > - attach = dma_buf_attach(buf, dev->dev); > + attach = dma_buf_attach(buf, dev->dev, false); > if (IS_ERR(attach)) > return ERR_CAST(attach); > > diff --git a/drivers/gpu/drm/drm_prime.c 
b/drivers/gpu/drm/drm_prime.c > index bdb51c8f262e..8e70abca33b9 100644 > --- a/drivers/gpu/drm/drm_prime.c > +++ b/drivers/gpu/drm/drm_prime.c > @@ -949,7 +949,7 @@ struct drm_gem_object *drm_gem_prime_import_dev(struct drm_device *dev, > if (!dev->driver->gem_prime_import_sg_table) > return ERR_PTR(-EINVAL); > > - attach = dma_buf_attach(dma_buf, attach_dev); > + attach = dma_buf_attach(dma_buf, attach_dev, false); > if (IS_ERR(attach)) > return ERR_CAST(attach); > > diff --git a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c > index 9473050ac842..6015f6beb8e6 100644 > --- a/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c > +++ b/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c > @@ -305,7 +305,7 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev, > return ERR_PTR(-E2BIG); > > /* need to attach */ > - attach = dma_buf_attach(dma_buf, dev->dev); > + attach = dma_buf_attach(dma_buf, dev->dev, false); > if (IS_ERR(attach)) > return ERR_CAST(attach); > > diff --git a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c > index 2fda549dd82d..1992241fdf54 100644 > --- a/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c > +++ b/drivers/gpu/drm/i915/gem/selftests/i915_gem_dmabuf.c > @@ -287,7 +287,7 @@ static int igt_dmabuf_import_same_driver(struct drm_i915_private *i915, > goto out_import; > > /* Now try a fake an importer */ > - import_attach = dma_buf_attach(dmabuf, obj->base.dev->dev); > + import_attach = dma_buf_attach(dmabuf, obj->base.dev->dev, false); > if (IS_ERR(import_attach)) { > err = PTR_ERR(import_attach); > goto out_import; > diff --git a/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c b/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c > index 30cf1cdc1aa3..41fb4149409e 100644 > --- a/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c > +++ b/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c > @@ -114,7 +114,7 @@ struct drm_gem_object *omap_gem_prime_import(struct drm_device 
*dev, > } > } > > - attach = dma_buf_attach(dma_buf, dev->dev); > + attach = dma_buf_attach(dma_buf, dev->dev, false); > if (IS_ERR(attach)) > return ERR_CAST(attach); > > diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c > index ace3e5a805cf..e5527c9d10bb 100644 > --- a/drivers/gpu/drm/tegra/gem.c > +++ b/drivers/gpu/drm/tegra/gem.c > @@ -79,7 +79,7 @@ static struct host1x_bo_mapping *tegra_bo_pin(struct device *dev, struct host1x_ > if (obj->dma_buf) { > struct dma_buf *buf = obj->dma_buf; > > - map->attach = dma_buf_attach(buf, dev); > + map->attach = dma_buf_attach(buf, dev, false); > if (IS_ERR(map->attach)) { > err = PTR_ERR(map->attach); > goto free; > @@ -470,7 +470,7 @@ static struct tegra_bo *tegra_bo_import(struct drm_device *drm, > * domain, map it first to the DRM device to get an sgt. > */ > if (tegra->domain) { > - attach = dma_buf_attach(buf, drm->dev); > + attach = dma_buf_attach(buf, drm->dev, false); > if (IS_ERR(attach)) { > err = PTR_ERR(attach); > goto free; > diff --git a/drivers/gpu/drm/virtio/virtgpu_prime.c b/drivers/gpu/drm/virtio/virtgpu_prime.c > index 4de2a63ccd18..6d9d1fe342b6 100644 > --- a/drivers/gpu/drm/virtio/virtgpu_prime.c > +++ b/drivers/gpu/drm/virtio/virtgpu_prime.c > @@ -326,7 +326,7 @@ struct drm_gem_object *virtgpu_gem_prime_import(struct drm_device *dev, > drm_gem_private_object_init(dev, obj, buf->size); > > attach = dma_buf_dynamic_attach(buf, dev->dev, > - &virtgpu_dma_buf_attach_ops, obj); > + &virtgpu_dma_buf_attach_ops, obj, true); > if (IS_ERR(attach)) { > kfree(bo); > return ERR_CAST(attach); > diff --git a/drivers/gpu/drm/xe/xe_dma_buf.c b/drivers/gpu/drm/xe/xe_dma_buf.c > index f7a20264ea33..9f524b9ed425 100644 > --- a/drivers/gpu/drm/xe/xe_dma_buf.c > +++ b/drivers/gpu/drm/xe/xe_dma_buf.c > @@ -293,7 +293,7 @@ struct drm_gem_object *xe_gem_prime_import(struct drm_device *dev, > attach_ops = test->attach_ops; > #endif > > - attach = dma_buf_dynamic_attach(dma_buf, dev->dev, attach_ops, 
&bo->ttm.base); > + attach = dma_buf_dynamic_attach(dma_buf, dev->dev, attach_ops, &bo->ttm.base, false); > if (IS_ERR(attach)) { > obj = ERR_CAST(attach); > goto out_err; > diff --git a/drivers/iio/industrialio-buffer.c b/drivers/iio/industrialio-buffer.c > index a80f7cc25a27..1296af4c2f7a 100644 > --- a/drivers/iio/industrialio-buffer.c > +++ b/drivers/iio/industrialio-buffer.c > @@ -1679,7 +1679,7 @@ static int iio_buffer_attach_dmabuf(struct iio_dev_buffer_pair *ib, > goto err_free_priv; > } > > - attach = dma_buf_attach(dmabuf, indio_dev->dev.parent); > + attach = dma_buf_attach(dmabuf, indio_dev->dev.parent, false); > if (IS_ERR(attach)) { > err = PTR_ERR(attach); > goto err_dmabuf_put; > diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c > index 0ec2e4120cc9..ed635c407cbd 100644 > --- a/drivers/infiniband/core/umem_dmabuf.c > +++ b/drivers/infiniband/core/umem_dmabuf.c > @@ -159,7 +159,8 @@ ib_umem_dmabuf_get_with_dma_device(struct ib_device *device, > dmabuf, > dma_device, > ops, > - umem_dmabuf); > + umem_dmabuf, > + false); > if (IS_ERR(umem_dmabuf->attach)) { > ret = ERR_CAST(umem_dmabuf->attach); > goto out_free_umem; > diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c b/drivers/media/common/videobuf2/videobuf2-dma-contig.c > index a13ec569c82f..362f5b555ce2 100644 > --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c > +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c > @@ -786,7 +786,7 @@ static void *vb2_dc_attach_dmabuf(struct vb2_buffer *vb, struct device *dev, > buf->vb = vb; > > /* create attachment for the dmabuf with the user device */ > - dba = dma_buf_attach(dbuf, buf->dev); > + dba = dma_buf_attach(dbuf, buf->dev, false); > if (IS_ERR(dba)) { > pr_err("failed to attach dmabuf\n"); > kfree(buf); > diff --git a/drivers/media/common/videobuf2/videobuf2-dma-sg.c b/drivers/media/common/videobuf2/videobuf2-dma-sg.c > index c6ddf2357c58..4f9a4e9783a1 100644 > --- 
a/drivers/media/common/videobuf2/videobuf2-dma-sg.c > +++ b/drivers/media/common/videobuf2/videobuf2-dma-sg.c > @@ -632,7 +632,7 @@ static void *vb2_dma_sg_attach_dmabuf(struct vb2_buffer *vb, struct device *dev, > > buf->dev = dev; > /* create attachment for the dmabuf with the user device */ > - dba = dma_buf_attach(dbuf, buf->dev); > + dba = dma_buf_attach(dbuf, buf->dev, false); > if (IS_ERR(dba)) { > pr_err("failed to attach dmabuf\n"); > kfree(buf); > diff --git a/drivers/media/platform/nvidia/tegra-vde/dmabuf-cache.c b/drivers/media/platform/nvidia/tegra-vde/dmabuf-cache.c > index b34244ea14dd..d04da2d3e4da 100644 > --- a/drivers/media/platform/nvidia/tegra-vde/dmabuf-cache.c > +++ b/drivers/media/platform/nvidia/tegra-vde/dmabuf-cache.c > @@ -95,7 +95,7 @@ int tegra_vde_dmabuf_cache_map(struct tegra_vde *vde, > goto ref; > } > > - attachment = dma_buf_attach(dmabuf, dev); > + attachment = dma_buf_attach(dmabuf, dev, false); > if (IS_ERR(attachment)) { > dev_err(dev, "Failed to attach dmabuf\n"); > err = PTR_ERR(attachment); > diff --git a/drivers/misc/fastrpc.c b/drivers/misc/fastrpc.c > index 7b7a22c91fe4..aee6f4cbd6c6 100644 > --- a/drivers/misc/fastrpc.c > +++ b/drivers/misc/fastrpc.c > @@ -778,7 +778,7 @@ static int fastrpc_map_create(struct fastrpc_user *fl, int fd, > goto get_err; > } > > - map->attach = dma_buf_attach(map->buf, sess->dev); > + map->attach = dma_buf_attach(map->buf, sess->dev, false); > if (IS_ERR(map->attach)) { > dev_err(sess->dev, "Failed to attach dmabuf\n"); > err = PTR_ERR(map->attach); > diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c > index 2dea9e42a0f8..51926ffdb843 100644 > --- a/drivers/usb/gadget/function/f_fs.c > +++ b/drivers/usb/gadget/function/f_fs.c > @@ -1487,7 +1487,7 @@ static int ffs_dmabuf_attach(struct file *file, int fd) > if (IS_ERR(dmabuf)) > return PTR_ERR(dmabuf); > > - attach = dma_buf_attach(dmabuf, gadget->dev.parent); > + attach = dma_buf_attach(dmabuf, 
gadget->dev.parent, false); > if (IS_ERR(attach)) { > err = PTR_ERR(attach); > goto err_dmabuf_put; > diff --git a/drivers/xen/gntdev-dmabuf.c b/drivers/xen/gntdev-dmabuf.c > index 5453d86324f6..9de191b6d1f7 100644 > --- a/drivers/xen/gntdev-dmabuf.c > +++ b/drivers/xen/gntdev-dmabuf.c > @@ -587,7 +587,7 @@ dmabuf_imp_to_refs(struct gntdev_dmabuf_priv *priv, struct device *dev, > gntdev_dmabuf->priv = priv; > gntdev_dmabuf->fd = fd; > > - attach = dma_buf_attach(dma_buf, dev); > + attach = dma_buf_attach(dma_buf, dev, false); > if (IS_ERR(attach)) { > ret = ERR_CAST(attach); > goto fail_free_obj; > diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h > index 36216d28d8bd..1ea25089b3ba 100644 > --- a/include/linux/dma-buf.h > +++ b/include/linux/dma-buf.h > @@ -598,11 +598,12 @@ dma_buf_attachment_is_dynamic(struct dma_buf_attachment *attach) > } > > struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf, > - struct device *dev); > + struct device *dev, > + bool skip_map); > struct dma_buf_attachment * > dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, > const struct dma_buf_attach_ops *importer_ops, > - void *importer_priv); > + void *importer_priv, bool skip_map); > void dma_buf_detach(struct dma_buf *dmabuf, > struct dma_buf_attachment *attach); > int dma_buf_pin(struct dma_buf_attachment *attach); > diff --git a/net/core/devmem.c b/net/core/devmem.c > index 6e27a47d0493..8137ecff9e39 100644 > --- a/net/core/devmem.c > +++ b/net/core/devmem.c > @@ -202,7 +202,7 @@ net_devmem_bind_dmabuf(struct net_device *dev, unsigned int dmabuf_fd, > > binding->dmabuf = dmabuf; > > - binding->attachment = dma_buf_attach(binding->dmabuf, dev->dev.parent); > + binding->attachment = dma_buf_attach(binding->dmabuf, dev->dev.parent, false); > if (IS_ERR(binding->attachment)) { > err = PTR_ERR(binding->attachment); > NL_SET_ERR_MSG(extack, "Failed to bind dmabuf to device");

9 months, 2 weeks


[PATCH] dma-buf: system_heap: No separate allocation for attachment sg_tables

by T.J. Mercier

struct dma_heap_attachment is a separate allocation from the struct
sg_table it contains, but there is no reason for this. Let's use the
slab allocator just once instead of twice for dma_heap_attachment.

Signed-off-by: T.J. Mercier <tjmercier(a)google.com>
---
 drivers/dma-buf/heaps/system_heap.c | 43 ++++++++++++-----------------
 1 file changed, 17 insertions(+), 26 deletions(-)

diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index 26d5dc89ea16..bee10c400cf0 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -35,7 +35,7 @@ struct system_heap_buffer {

 struct dma_heap_attachment {
 	struct device *dev;
-	struct sg_table *table;
+	struct sg_table table;
 	struct list_head list;
 	bool mapped;
 };
@@ -54,29 +54,22 @@ static gfp_t order_flags[] = {HIGH_ORDER_GFP, HIGH_ORDER_GFP, LOW_ORDER_GFP};
 static const unsigned int orders[] = {8, 4, 0};
 #define NUM_ORDERS ARRAY_SIZE(orders)

-static struct sg_table *dup_sg_table(struct sg_table *table)
+static int dup_sg_table(struct sg_table *from, struct sg_table *to)
 {
-	struct sg_table *new_table;
-	int ret, i;
 	struct scatterlist *sg, *new_sg;
+	int ret, i;

-	new_table = kzalloc(sizeof(*new_table), GFP_KERNEL);
-	if (!new_table)
-		return ERR_PTR(-ENOMEM);
-
-	ret = sg_alloc_table(new_table, table->orig_nents, GFP_KERNEL);
-	if (ret) {
-		kfree(new_table);
-		return ERR_PTR(-ENOMEM);
-	}
+	ret = sg_alloc_table(to, from->orig_nents, GFP_KERNEL);
+	if (ret)
+		return ret;

-	new_sg = new_table->sgl;
-	for_each_sgtable_sg(table, sg, i) {
+	new_sg = to->sgl;
+	for_each_sgtable_sg(from, sg, i) {
 		sg_set_page(new_sg, sg_page(sg), sg->length, sg->offset);
 		new_sg = sg_next(new_sg);
 	}

-	return new_table;
+	return 0;
 }

 static int system_heap_attach(struct dma_buf *dmabuf,
@@ -84,19 +77,18 @@ static int system_heap_attach(struct dma_buf *dmabuf,
 {
 	struct system_heap_buffer *buffer = dmabuf->priv;
 	struct dma_heap_attachment *a;
-	struct sg_table *table;
+	int ret;

 	a = kzalloc(sizeof(*a), GFP_KERNEL);
 	if (!a)
 		return -ENOMEM;

-	table = dup_sg_table(&buffer->sg_table);
-	if (IS_ERR(table)) {
+	ret = dup_sg_table(&buffer->sg_table, &a->table);
+	if (ret) {
 		kfree(a);
-		return -ENOMEM;
+		return ret;
 	}

-	a->table = table;
 	a->dev = attachment->dev;
 	INIT_LIST_HEAD(&a->list);
 	a->mapped = false;
@@ -120,8 +112,7 @@ static void system_heap_detach(struct dma_buf *dmabuf,
 	list_del(&a->list);
 	mutex_unlock(&buffer->lock);

-	sg_free_table(a->table);
-	kfree(a->table);
+	sg_free_table(&a->table);
 	kfree(a);
 }

@@ -129,7 +120,7 @@ static struct sg_table *system_heap_map_dma_buf(struct dma_buf_attachment *attac
 						enum dma_data_direction direction)
 {
 	struct dma_heap_attachment *a = attachment->priv;
-	struct sg_table *table = a->table;
+	struct sg_table *table = &a->table;
 	int ret;

 	ret = dma_map_sgtable(attachment->dev, table, direction, 0);
@@ -164,7 +155,7 @@ static int system_heap_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
 	list_for_each_entry(a, &buffer->attachments, list) {
 		if (!a->mapped)
 			continue;
-		dma_sync_sgtable_for_cpu(a->dev, a->table, direction);
+		dma_sync_sgtable_for_cpu(a->dev, &a->table, direction);
 	}
 	mutex_unlock(&buffer->lock);

@@ -185,7 +176,7 @@ static int system_heap_dma_buf_end_cpu_access(struct dma_buf *dmabuf,
 	list_for_each_entry(a, &buffer->attachments, list) {
 		if (!a->mapped)
 			continue;
-		dma_sync_sgtable_for_device(a->dev, a->table, direction);
+		dma_sync_sgtable_for_device(a->dev, &a->table, direction);
 	}
 	mutex_unlock(&buffer->lock);


base-commit: 8ffd015db85fea3e15a77027fda6c02ced4d2444
--
2.49.0.805.g082f7c87e0-goog
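The idea in the patch above (embed the table in the attachment so that one allocation covers both) can be illustrated outside the kernel. What follows is only a userspace sketch under stated assumptions: struct table, struct attachment, dup_table(), attach_create() and attach_destroy() are hypothetical stand-ins invented for illustration, with malloc()/calloc() playing the role of the slab allocator. It is not the system_heap code itself.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct sg_table. */
struct table {
	int *entries;
	unsigned int nents;
};

/* The table is embedded, so allocating the attachment allocates it too. */
struct attachment {
	struct table table;
	int mapped;
};

/*
 * Copy 'from' into caller-provided storage 'to'.
 * Returns 0 on success, -1 on allocation failure.
 * Assumes from->nents > 0 to keep the sketch short.
 */
static int dup_table(const struct table *from, struct table *to)
{
	to->entries = malloc(from->nents * sizeof(*to->entries));
	if (!to->entries)
		return -1;

	memcpy(to->entries, from->entries, from->nents * sizeof(*to->entries));
	to->nents = from->nents;
	return 0;
}

/* One allocation for the attachment, plus the entry array itself. */
static struct attachment *attach_create(const struct table *src)
{
	struct attachment *a = calloc(1, sizeof(*a));

	if (!a)
		return NULL;

	if (dup_table(src, &a->table)) {
		free(a);
		return NULL;
	}
	return a;
}

static void attach_destroy(struct attachment *a)
{
	free(a->table.entries);	/* releases the entry array only */
	free(a);		/* releases attachment and embedded table together */
}
```

Compared with a pointer member, the error path loses one free() and the helper can return a plain error code instead of an error pointer, which mirrors the shape of the change in the diff.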

9 months, 2 weeks


Re: [PATCH 2/3] dma-buf: Add DMA_BUF_IOCTL_GET_DMA_ADDR

by Christian König

On 4/29/25 08:39, Simona Vetter wrote: > Catching up after spring break, hence the late reply ... > > On Fri, Apr 11, 2025 at 02:34:37PM -0400, Nicolas Dufresne wrote: >> Le jeudi 10 avril 2025 à 16:53 +0200, Bastien Curutchet a écrit : >>> There is no way to transmit the DMA address of a buffer to userspace. >>> Some UIO users need this to handle DMA from userspace. >> >> To me this API is against all safe practice we've been pushing forward >> and has no place in DMA_BUF API. >> >> If this is fine for the UIO subsystem to pass around physicial >> addresses, then make this part of the UIO device ioctl. > > Yeah, this has no business in dma-buf since the entire point of dma-buf > was to stop all the nasty "just pass raw dma addr in userspace" hacks that > preceeded it. > > And over the years since dma-buf landed, we've removed a lot of these, > like dri1 drivers. Or where that's not possible like with fbdev, hid the > raw dma addr uapi behind a Kconfig. > > I concur with the overall sentiment that this should be done in > vfio/iommufd interfaces, maybe with some support added to map dma-buf. I > think patches for that have been floating around for a while, but I lost a > bit the status of where exactly they are. My take away is that we need to have a documented way for special driver specific interfaces in DMA-buf. In other words DMA-buf has some standardized rules of doing things which every implementation should follow. The implementations might of course still have bugs (e.g. allocate memory for a dma_fence operation), but at least we have documented what should be done and what's forbidden. What is still missing in the documentation is the use case when you have for example vfio which wants to talk to iommufd through a specialized interface. 
This doesn't necessarily needs to be part of DMA-buf, but we should still document "do it this way" because that has already worked in the last ten use cases and we don't want people to re-invent the wheel in a new funky way which then later turns out to not work. Regards, Christian. > > Cheers, Sima > >> >> regards, >> Nicolas >> >>> >>> Add a new dma_buf_ops operation that returns the DMA address. >>> Add a new ioctl to transmit this DMA address to userspace. >>> >>> Signed-off-by: Bastien Curutchet <bastien.curutchet(a)bootlin.com> >>> --- >>> drivers/dma-buf/dma-buf.c | 21 +++++++++++++++++++++ >>> include/linux/dma-buf.h | 1 + >>> include/uapi/linux/dma-buf.h | 1 + >>> 3 files changed, 23 insertions(+) >>> >>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c >>> index >>> 398418bd9731ad7a3a1f12eaea6a155fa77a22fe..cbbb518981e54e50f479c3d1fcf >>> 6da6971f639c1 100644 >>> --- a/drivers/dma-buf/dma-buf.c >>> +++ b/drivers/dma-buf/dma-buf.c >>> @@ -454,6 +454,24 @@ static long dma_buf_import_sync_file(struct >>> dma_buf *dmabuf, >>> } >>> #endif >>> >>> +static int dma_buf_get_dma_addr(struct dma_buf *dmabuf, u64 __user >>> *arg) >>> +{ >>> + u64 addr; >>> + int ret; >>> + >>> + if (!dmabuf->ops->get_dma_addr) >>> + return -EINVAL; >>> + >>> + ret = dmabuf->ops->get_dma_addr(dmabuf, &addr); >>> + if (ret) >>> + return ret; >>> + >>> + if (copy_to_user(arg, &addr, sizeof(u64))) >>> + return -EFAULT; >>> + >>> + return 0; >>> +} >>> + >>> static long dma_buf_ioctl(struct file *file, >>> unsigned int cmd, unsigned long arg) >>> { >>> @@ -504,6 +522,9 @@ static long dma_buf_ioctl(struct file *file, >>> return dma_buf_import_sync_file(dmabuf, (const void >>> __user *)arg); >>> #endif >>> >>> + case DMA_BUF_IOCTL_GET_DMA_ADDR: >>> + return dma_buf_get_dma_addr(dmabuf, (u64 __user >>> *)arg); >>> + >>> default: >>> return -ENOTTY; >>> } >>> diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h >>> index >>> 
36216d28d8bdc01a9c9c47e27c392413f7f6c5fb..ed4bf15d3ce82e7a86323fff459 >>> 699a9bc8baa3b 100644 >>> --- a/include/linux/dma-buf.h >>> +++ b/include/linux/dma-buf.h >>> @@ -285,6 +285,7 @@ struct dma_buf_ops { >>> >>> int (*vmap)(struct dma_buf *dmabuf, struct iosys_map *map); >>> void (*vunmap)(struct dma_buf *dmabuf, struct iosys_map >>> *map); >>> + int (*get_dma_addr)(struct dma_buf *dmabuf, u64 *addr); >>> }; >>> >>> /** >>> diff --git a/include/uapi/linux/dma-buf.h b/include/uapi/linux/dma- >>> buf.h >>> index >>> 5a6fda66d9adf01438619e7e67fa69f0fec2d88d..f3aba46942042de6a2e3a4cca3e >>> b3f87175e29c9 100644 >>> --- a/include/uapi/linux/dma-buf.h >>> +++ b/include/uapi/linux/dma-buf.h >>> @@ -178,5 +178,6 @@ struct dma_buf_import_sync_file { >>> #define DMA_BUF_SET_NAME_B _IOW(DMA_BUF_BASE, 1, __u64) >>> #define DMA_BUF_IOCTL_EXPORT_SYNC_FILE _IOWR(DMA_BUF_BASE, 2, >>> struct dma_buf_export_sync_file) >>> #define DMA_BUF_IOCTL_IMPORT_SYNC_FILE _IOW(DMA_BUF_BASE, 3, struct >>> dma_buf_import_sync_file) >>> +#define DMA_BUF_IOCTL_GET_DMA_ADDR _IOR(DMA_BUF_BASE, 4, __u64 >>> *) >>> >>> #endif >

9 months, 3 weeks


[PATCH v3 00/33] drm/msm: sparse / "VM_BIND" support

by Rob Clark

From: Rob Clark <robdclark(a)chromium.org> Conversion to DRM GPU VA Manager[1], and adding support for Vulkan Sparse Memory[2] in the form of: 1. A new VM_BIND submitqueue type for executing VM MSM_SUBMIT_BO_OP_MAP/ MAP_NULL/UNMAP commands 2. A new VM_BIND ioctl to allow submitting batches of one or more MAP/MAP_NULL/UNMAP commands to a VM_BIND submitqueue I did not implement support for synchronous VM_BIND commands. Since userspace could just immediately wait for the `SUBMIT` to complete, I don't think we need this extra complexity in the kernel. Synchronous/immediate VM_BIND operations could be implemented with a 2nd VM_BIND submitqueue. The corresponding mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/32533 Changes in v3: - Switched to seperate VM_BIND ioctl. This makes the UABI a bit cleaner, but OTOH the userspace code was cleaner when the end result of either type of VkQueue lead to the same ioctl. So I'm a bit on the fence. - Switched to doing the gpuvm bookkeeping synchronously, and only deferring the pgtable updates. This avoids needing to hold any resv locks in the fence signaling path, resolving the last shrinker related lockdep complaints. OTOH it means userspace can trigger invalid pgtable updates with multiple VM_BIND queues. In this case, we ensure that unmaps happen completely (to prevent userspace from using this to access free'd pages), mark the context as unusable, and move on with life. - Link to v2: https://lore.kernel.org/all/20250319145425.51935-1-robdclark@gmail.com/ Changes in v2: - Dropped Bibek Kumar Patro's arm-smmu patches[3], which have since been merged. - Pre-allocate all the things, and drop HACK patch which disabled shrinker. This includes ensuring that vm_bo objects are allocated up front, pre- allocating VMA objects, and pre-allocating pages used for pgtable updates. The latter utilizes io_pgtable_cfg callbacks for pgtable alloc/free, that were initially added for panthor. 
- Add back support for BO dumping for devcoredump. - Link to v1 (RFC): https://lore.kernel.org/dri-devel/20241207161651.410556-1-robdclark@gmail.c… [1] https://www.kernel.org/doc/html/next/gpu/drm-mm.html#drm-gpuvm [2] https://docs.vulkan.org/spec/latest/chapters/sparsemem.html [3] https://patchwork.kernel.org/project/linux-arm-kernel/list/?series=909700 Rob Clark (33): drm/gpuvm: Don't require obj lock in destructor path drm/gpuvm: Allow VAs to hold soft reference to BOs iommu/io-pgtable-arm: Add quirk to quiet WARN_ON() drm/msm: Rename msm_file_private -> msm_context drm/msm: Improve msm_context comments drm/msm: Rename msm_gem_address_space -> msm_gem_vm drm/msm: Remove vram carveout support drm/msm: Collapse vma allocation and initialization drm/msm: Collapse vma close and delete drm/msm: Don't close VMAs on purge drm/msm: drm_gpuvm conversion drm/msm: Convert vm locking drm/msm: Use drm_gpuvm types more drm/msm: Split out helper to get iommu prot flags drm/msm: Add mmu support for non-zero offset drm/msm: Add PRR support drm/msm: Rename msm_gem_vma_purge() -> _unmap() drm/msm: Lazily create context VM drm/msm: Add opt-in for VM_BIND drm/msm: Mark VM as unusable on GPU hangs drm/msm: Add _NO_SHARE flag drm/msm: Crashdump prep for sparse mappings drm/msm: rd dumping prep for sparse mappings drm/msm: Crashdec support for sparse drm/msm: rd dumping support for sparse drm/msm: Extract out syncobj helpers drm/msm: Use DMA_RESV_USAGE_BOOKKEEP/KERNEL drm/msm: Add VM_BIND submitqueue drm/msm: Support IO_PGTABLE_QUIRK_NO_WARN_ON drm/msm: Support pgtable preallocation drm/msm: Split out map/unmap ops drm/msm: Add VM_BIND ioctl drm/msm: Bump UAPI version drivers/gpu/drm/drm_gpuvm.c | 15 +- drivers/gpu/drm/msm/Kconfig | 1 + drivers/gpu/drm/msm/Makefile | 1 + drivers/gpu/drm/msm/adreno/a2xx_gpu.c | 25 +- drivers/gpu/drm/msm/adreno/a2xx_gpummu.c | 5 +- drivers/gpu/drm/msm/adreno/a3xx_gpu.c | 17 +- drivers/gpu/drm/msm/adreno/a4xx_gpu.c | 17 +- 
drivers/gpu/drm/msm/adreno/a5xx_debugfs.c | 4 +- drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 22 +- drivers/gpu/drm/msm/adreno/a5xx_power.c | 2 +- drivers/gpu/drm/msm/adreno/a5xx_preempt.c | 10 +- drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 32 +- drivers/gpu/drm/msm/adreno/a6xx_gmu.h | 2 +- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 49 +- drivers/gpu/drm/msm/adreno/a6xx_gpu_state.c | 6 +- drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 10 +- drivers/gpu/drm/msm/adreno/adreno_device.c | 4 - drivers/gpu/drm/msm/adreno/adreno_gpu.c | 88 +- drivers/gpu/drm/msm/adreno/adreno_gpu.h | 23 +- .../drm/msm/disp/dpu1/dpu_encoder_phys_wb.c | 14 +- drivers/gpu/drm/msm/disp/dpu1/dpu_formats.c | 18 +- drivers/gpu/drm/msm/disp/dpu1/dpu_formats.h | 2 +- drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 18 +- drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c | 14 +- drivers/gpu/drm/msm/disp/dpu1/dpu_plane.h | 4 +- drivers/gpu/drm/msm/disp/mdp4/mdp4_crtc.c | 6 +- drivers/gpu/drm/msm/disp/mdp4/mdp4_kms.c | 28 +- drivers/gpu/drm/msm/disp/mdp4/mdp4_plane.c | 12 +- drivers/gpu/drm/msm/disp/mdp5/mdp5_crtc.c | 4 +- drivers/gpu/drm/msm/disp/mdp5/mdp5_kms.c | 19 +- drivers/gpu/drm/msm/disp/mdp5/mdp5_plane.c | 12 +- drivers/gpu/drm/msm/dsi/dsi_host.c | 14 +- drivers/gpu/drm/msm/msm_drv.c | 183 +-- drivers/gpu/drm/msm/msm_drv.h | 35 +- drivers/gpu/drm/msm/msm_fb.c | 18 +- drivers/gpu/drm/msm/msm_fbdev.c | 2 +- drivers/gpu/drm/msm/msm_gem.c | 489 +++---- drivers/gpu/drm/msm/msm_gem.h | 217 ++- drivers/gpu/drm/msm/msm_gem_prime.c | 15 + drivers/gpu/drm/msm/msm_gem_shrinker.c | 4 +- drivers/gpu/drm/msm/msm_gem_submit.c | 295 ++-- drivers/gpu/drm/msm/msm_gem_vma.c | 1265 +++++++++++++++-- drivers/gpu/drm/msm/msm_gpu.c | 171 ++- drivers/gpu/drm/msm/msm_gpu.h | 132 +- drivers/gpu/drm/msm/msm_iommu.c | 298 +++- drivers/gpu/drm/msm/msm_kms.c | 18 +- drivers/gpu/drm/msm/msm_kms.h | 2 +- drivers/gpu/drm/msm/msm_mmu.h | 38 +- drivers/gpu/drm/msm/msm_rd.c | 62 +- drivers/gpu/drm/msm/msm_ringbuffer.c | 4 +- 
drivers/gpu/drm/msm/msm_submitqueue.c | 86 +- drivers/gpu/drm/msm/msm_syncobj.c | 172 +++ drivers/gpu/drm/msm/msm_syncobj.h | 37 + drivers/iommu/io-pgtable-arm.c | 18 +- include/drm/drm_gpuvm.h | 12 +- include/linux/io-pgtable.h | 8 + include/uapi/drm/msm_drm.h | 149 +- 57 files changed, 3012 insertions(+), 1216 deletions(-) create mode 100644 drivers/gpu/drm/msm/msm_syncobj.c create mode 100644 drivers/gpu/drm/msm/msm_syncobj.h -- 2.49.0

9 months, 3 weeks


Re: [PATCH 4/4] drm/nouveau: Check dma_fence in canonical way

by Christian König

On 4/24/25 15:02, Philipp Stanner wrote:
> In nouveau_fence_done(), a fence is checked for being signaled by
> manually evaluating the base fence's bits. This can be done in a
> canonical manner through dma_fence_is_signaled().
>
> Replace the bit-check with dma_fence_is_signaled().
>
> Signed-off-by: Philipp Stanner <phasta(a)kernel.org>

I think the bit check was used here as fast path optimization because we later call dma_fence_is_signaled() anyway.

Feel free to add my acked-by, but honestly what nouveau does here looks rather suspicious to me.

Regards,
Christian.

> ---
>  drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
> index fb9811938c82..d5654e26d5bc 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> @@ -253,7 +253,7 @@ nouveau_fence_done(struct nouveau_fence *fence)
>  	struct nouveau_channel *chan;
>  	unsigned long flags;
>
> -	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->base.flags))
> +	if (dma_fence_is_signaled(&fence->base))
>  		return true;
>
>  	spin_lock_irqsave(&fctx->lock, flags);
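To make the "fast path" remark above concrete, here is a small userspace model of a signaled check that tests a cached flag first and only falls back to a backend poll when the flag is clear. It is a sketch under assumptions, not kernel code: struct toy_fence, TOY_FENCE_SIGNALED, toy_fence_is_signaled() and toy_poll_done() are hypothetical names, and locking and atomicity are deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_FENCE_SIGNALED (1u << 0)

struct toy_fence {
	unsigned int flags;
	/* Optional backend poll; may be NULL. */
	bool (*signaled)(struct toy_fence *f);
	/* Counts slow-path calls, purely for illustration. */
	unsigned int polls;
};

/*
 * Cheap flag test first; only ask the backend when the flag is clear,
 * and cache a positive answer so later calls stay on the fast path.
 */
static bool toy_fence_is_signaled(struct toy_fence *f)
{
	if (f->flags & TOY_FENCE_SIGNALED)
		return true;

	if (f->signaled && f->signaled(f)) {
		f->flags |= TOY_FENCE_SIGNALED;
		return true;
	}
	return false;
}

/* Example backend that reports completion and records being polled. */
static bool toy_poll_done(struct toy_fence *f)
{
	f->polls++;
	return true;
}
```

In this model a helper that already starts with the flag test makes a separate open-coded flag test in front of it redundant, which is the trade-off the two messages above are discussing.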

9 months, 3 weeks


Re: [PATCH 3/4] drm/nouveau: Simplify nouveau_fence_done()

by Christian König

On 4/24/25 15:02, Philipp Stanner wrote:
> nouveau_fence_done() contains an if branch that checks whether a
> nouveau_fence has either of the two existing nouveau_fence backend ops,
> which will always evaluate to true.
>
> Remove the surplus check.
>
> Signed-off-by: Philipp Stanner <phasta(a)kernel.org>

Reviewed-by: Christian König <christian.koenig(a)amd.com>

> ---
>  drivers/gpu/drm/nouveau/nouveau_fence.c | 24 +++++++++++-------------
>  1 file changed, 11 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
> index 2b79bcb7da16..fb9811938c82 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> @@ -249,21 +249,19 @@ nouveau_fence_emit(struct nouveau_fence *fence)
>  bool
>  nouveau_fence_done(struct nouveau_fence *fence)
>  {
> -	if (fence->base.ops == &nouveau_fence_ops_legacy ||
> -	    fence->base.ops == &nouveau_fence_ops_uevent) {
> -		struct nouveau_fence_chan *fctx = nouveau_fctx(fence);
> -		struct nouveau_channel *chan;
> -		unsigned long flags;
> +	struct nouveau_fence_chan *fctx = nouveau_fctx(fence);
> +	struct nouveau_channel *chan;
> +	unsigned long flags;
>
> -		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->base.flags))
> -			return true;
> +	if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->base.flags))
> +		return true;
> +
> +	spin_lock_irqsave(&fctx->lock, flags);
> +	chan = rcu_dereference_protected(fence->channel, lockdep_is_held(&fctx->lock));
> +	if (chan)
> +		nouveau_fence_update(chan, fctx);
> +	spin_unlock_irqrestore(&fctx->lock, flags);
>
> -		spin_lock_irqsave(&fctx->lock, flags);
> -		chan = rcu_dereference_protected(fence->channel, lockdep_is_held(&fctx->lock));
> -		if (chan)
> -			nouveau_fence_update(chan, fctx);
> -		spin_unlock_irqrestore(&fctx->lock, flags);
> -	}
>  	return dma_fence_is_signaled(&fence->base);
>  }
>

9 months, 3 weeks


Re: [PATCH v7 04/11] optee: sync secure world ABI headers

by Jens Wiklander

Hi Rouven, On Fri, Apr 25, 2025 at 3:36 PM Rouven Czerwinski <rouven.czerwinski(a)linaro.org> wrote: > > Hi, > > On Fri, 4 Apr 2025 at 16:31, Jens Wiklander <jens.wiklander(a)linaro.org> wrote: > > > > Update the header files describing the secure world ABI, both with and > > without FF-A. The ABI is extended to deal with protected memory, but as > > usual backward compatible. > > > > Signed-off-by: Jens Wiklander <jens.wiklander(a)linaro.org> > > --- > > drivers/tee/optee/optee_ffa.h | 27 +++++++++--- > > drivers/tee/optee/optee_msg.h | 83 ++++++++++++++++++++++++++++++----- > > drivers/tee/optee/optee_smc.h | 71 +++++++++++++++++++++++++++++- > > 3 files changed, 163 insertions(+), 18 deletions(-) > > > > diff --git a/drivers/tee/optee/optee_ffa.h b/drivers/tee/optee/optee_ffa.h > > index 257735ae5b56..cc257e7956a3 100644 > > --- a/drivers/tee/optee/optee_ffa.h > > +++ b/drivers/tee/optee/optee_ffa.h > > @@ -81,7 +81,7 @@ > > * as the second MSG arg struct for > > * OPTEE_FFA_YIELDING_CALL_WITH_ARG. 
> > * Bit[31:8]: Reserved (MBZ) > > - * w5: Bitfield of secure world capabilities OPTEE_FFA_SEC_CAP_* below, > > + * w5: Bitfield of OP-TEE capabilities OPTEE_FFA_SEC_CAP_* > > * w6: The maximum secure world notification number > > * w7: Not used (MBZ) > > */ > > @@ -94,6 +94,8 @@ > > #define OPTEE_FFA_SEC_CAP_ASYNC_NOTIF BIT(1) > > /* OP-TEE supports probing for RPMB device if needed */ > > #define OPTEE_FFA_SEC_CAP_RPMB_PROBE BIT(2) > > +/* OP-TEE supports Protected Memory for secure data path */ > > +#define OPTEE_FFA_SEC_CAP_PROTMEM BIT(3) > > > > #define OPTEE_FFA_EXCHANGE_CAPABILITIES OPTEE_FFA_BLOCKING_CALL(2) > > > > @@ -108,7 +110,7 @@ > > * > > * Return register usage: > > * w3: Error code, 0 on success > > - * w4-w7: Note used (MBZ) > > + * w4-w7: Not used (MBZ) > > */ > > #define OPTEE_FFA_UNREGISTER_SHM OPTEE_FFA_BLOCKING_CALL(3) > > > > @@ -119,16 +121,31 @@ > > * Call register usage: > > * w3: Service ID, OPTEE_FFA_ENABLE_ASYNC_NOTIF > > * w4: Notification value to request bottom half processing, should be > > - * less than OPTEE_FFA_MAX_ASYNC_NOTIF_VALUE. 
> > + * less than OPTEE_FFA_MAX_ASYNC_NOTIF_VALUE > > * w5-w7: Not used (MBZ) > > * > > * Return register usage: > > * w3: Error code, 0 on success > > - * w4-w7: Note used (MBZ) > > + * w4-w7: Not used (MBZ) > > */ > > #define OPTEE_FFA_ENABLE_ASYNC_NOTIF OPTEE_FFA_BLOCKING_CALL(5) > > > > -#define OPTEE_FFA_MAX_ASYNC_NOTIF_VALUE 64 > > +#define OPTEE_FFA_MAX_ASYNC_NOTIF_VALUE 64 > > + > > +/* > > + * Release Protected memory > > + * > > + * Call register usage: > > + * w3: Service ID, OPTEE_FFA_RECLAIM_PROTMEM > > + * w4: Shared memory handle, lower bits > > + * w5: Shared memory handle, higher bits > > + * w6-w7: Not used (MBZ) > > + * > > + * Return register usage: > > + * w3: Error code, 0 on success > > + * w4-w7: Note used (MBZ) > > + */ > > +#define OPTEE_FFA_RELEASE_PROTMEM OPTEE_FFA_BLOCKING_CALL(8) > > > > /* > > * Call with struct optee_msg_arg as argument in the supplied shared memory > > diff --git a/drivers/tee/optee/optee_msg.h b/drivers/tee/optee/optee_msg.h > > index e8840a82b983..22d71d6f110d 100644 > > --- a/drivers/tee/optee/optee_msg.h > > +++ b/drivers/tee/optee/optee_msg.h > > @@ -133,13 +133,13 @@ struct optee_msg_param_rmem { > > }; > > > > /** > > - * struct optee_msg_param_fmem - ffa memory reference parameter > > + * struct optee_msg_param_fmem - FF-A memory reference parameter > > * @offs_lower: Lower bits of offset into shared memory reference > > * @offs_upper: Upper bits of offset into shared memory reference > > * @internal_offs: Internal offset into the first page of shared memory > > * reference > > * @size: Size of the buffer > > - * @global_id: Global identifier of Shared memory > > + * @global_id: Global identifier of the shared memory > > */ > > struct optee_msg_param_fmem { > > u32 offs_low; > > @@ -165,7 +165,7 @@ struct optee_msg_param_value { > > * @attr: attributes > > * @tmem: parameter by temporary memory reference > > * @rmem: parameter by registered memory reference > > - * @fmem: parameter by ffa registered memory 
reference > > + * @fmem: parameter by FF-A registered memory reference > > * @value: parameter by opaque value > > * @octets: parameter by octet string > > * > > @@ -296,6 +296,18 @@ struct optee_msg_arg { > > */ > > #define OPTEE_MSG_FUNCID_GET_OS_REVISION 0x0001 > > > > +/* > > + * Values used in OPTEE_MSG_CMD_LEND_PROTMEM below > > + * OPTEE_MSG_PROTMEM_RESERVED Reserved > > + * OPTEE_MSG_PROTMEM_SECURE_VIDEO_PLAY Secure Video Playback > > + * OPTEE_MSG_PROTMEM_TRUSTED_UI Trused UI > > + * OPTEE_MSG_PROTMEM_SECURE_VIDEO_RECORD Secure Video Recording > > + */ > > +#define OPTEE_MSG_PROTMEM_RESERVED 0 > > +#define OPTEE_MSG_PROTMEM_SECURE_VIDEO_PLAY 1 > > +#define OPTEE_MSG_PROTMEM_TRUSTED_UI 2 > > +#define OPTEE_MSG_PROTMEM_SECURE_VIDEO_RECORD 3 > > + > > /* > > * Do a secure call with struct optee_msg_arg as argument > > * The OPTEE_MSG_CMD_* below defines what goes in struct optee_msg_arg::cmd > > @@ -337,15 +349,62 @@ struct optee_msg_arg { > > * OPTEE_MSG_CMD_STOP_ASYNC_NOTIF informs secure world that from now is > > * normal world unable to process asynchronous notifications. Typically > > * used when the driver is shut down. > > + * > > + * OPTEE_MSG_CMD_LEND_PROTMEM lends protected memory. The passed normal > > + * physical memory is protected from normal world access. The memory > > + * should be unmapped prior to this call since it becomes inaccessible > > + * during the request. > > + * Parameters are passed as: > > + * [in] param[0].attr OPTEE_MSG_ATTR_TYPE_VALUE_INPUT > > + * [in] param[0].u.value.a OPTEE_MSG_PROTMEM_* defined above > > + * [in] param[1].attr OPTEE_MSG_ATTR_TYPE_TMEM_INPUT > > + * [in] param[1].u.tmem.buf_ptr physical address > > + * [in] param[1].u.tmem.size size > > + * [in] param[1].u.tmem.shm_ref holds protected memory reference > > + * > > + * OPTEE_MSG_CMD_RECLAIM_PROTMEM reclaims a previously lent protected > > + * memory reference. 
The physical memory is accessible by the normal world > > + * after this function has return and can be mapped again. The information > > + * is passed as: > > + * [in] param[0].attr OPTEE_MSG_ATTR_TYPE_VALUE_INPUT > > + * [in] param[0].u.value.a holds protected memory cookie > > + * > > + * OPTEE_MSG_CMD_GET_PROTMEM_CONFIG get configuration for a specific > > + * protected memory use case. Parameters are passed as: > > + * [in] param[0].attr OPTEE_MSG_ATTR_TYPE_VALUE_INOUT > > + * [in] param[0].value.a OPTEE_MSG_PROTMEM_* > > + * [in] param[1].attr OPTEE_MSG_ATTR_TYPE_{R,F}MEM_OUTPUT > > + * [in] param[1].u.{r,f}mem Buffer or NULL > > + * [in] param[1].u.{r,f}mem.size Provided size of buffer or 0 for query > > + * output for the protected use case: > > + * [out] param[0].value.a Minimal size of protected memory > > + * [out] param[0].value.b Required alignment of size and start of > > + * protected memory > > + * [out] param[1].{r,f}mem.size Size of output data > > + * [out] param[1].{r,f}mem If non-NULL, contains an array of > > + * uint16_t holding endpoints that > > + * must be included when lending > > + * memory for this use case > > + * > > + * OPTEE_MSG_CMD_ASSIGN_PROTMEM assigns use-case to protected memory > > + * previously lent using the FFA_LEND framework ABI. 
Parameters are passed > > + * as: > > + * [in] param[0].attr OPTEE_MSG_ATTR_TYPE_VALUE_INPUT > > + * [in] param[0].u.value.a holds protected memory cookie > > + * [in] param[0].u.value.b OPTEE_MSG_PROTMEM_* defined above > > */ > > -#define OPTEE_MSG_CMD_OPEN_SESSION 0 > > -#define OPTEE_MSG_CMD_INVOKE_COMMAND 1 > > -#define OPTEE_MSG_CMD_CLOSE_SESSION 2 > > -#define OPTEE_MSG_CMD_CANCEL 3 > > -#define OPTEE_MSG_CMD_REGISTER_SHM 4 > > -#define OPTEE_MSG_CMD_UNREGISTER_SHM 5 > > -#define OPTEE_MSG_CMD_DO_BOTTOM_HALF 6 > > -#define OPTEE_MSG_CMD_STOP_ASYNC_NOTIF 7 > > -#define OPTEE_MSG_FUNCID_CALL_WITH_ARG 0x0004 > > +#define OPTEE_MSG_CMD_OPEN_SESSION 0 > > +#define OPTEE_MSG_CMD_INVOKE_COMMAND 1 > > +#define OPTEE_MSG_CMD_CLOSE_SESSION 2 > > +#define OPTEE_MSG_CMD_CANCEL 3 > > +#define OPTEE_MSG_CMD_REGISTER_SHM 4 > > +#define OPTEE_MSG_CMD_UNREGISTER_SHM 5 > > +#define OPTEE_MSG_CMD_DO_BOTTOM_HALF 6 > > +#define OPTEE_MSG_CMD_STOP_ASYNC_NOTIF 7 > > +#define OPTEE_MSG_CMD_LEND_PROTMEM 8 > > +#define OPTEE_MSG_CMD_RECLAIM_PROTMEM 9 > > +#define OPTEE_MSG_CMD_GET_PROTMEM_CONFIG 10 > > +#define OPTEE_MSG_CMD_ASSIGN_PROTMEM 11 > > +#define OPTEE_MSG_FUNCID_CALL_WITH_ARG 0x0004 > > > > #endif /* _OPTEE_MSG_H */ > > diff --git a/drivers/tee/optee/optee_smc.h b/drivers/tee/optee/optee_smc.h > > index 879426300821..b17e81f464a3 100644 > > --- a/drivers/tee/optee/optee_smc.h > > +++ b/drivers/tee/optee/optee_smc.h > > @@ -264,7 +264,6 @@ struct optee_smc_get_shm_config_result { > > #define OPTEE_SMC_SEC_CAP_HAVE_RESERVED_SHM BIT(0) > > /* Secure world can communicate via previously unregistered shared memory */ > > #define OPTEE_SMC_SEC_CAP_UNREGISTERED_SHM BIT(1) > > - > > /* > > * Secure world supports commands "register/unregister shared memory", > > * secure world accepts command buffers located in any parts of non-secure RAM > > @@ -280,6 +279,10 @@ struct optee_smc_get_shm_config_result { > > #define OPTEE_SMC_SEC_CAP_RPC_ARG BIT(6) > > /* Secure world supports 
probing for RPMB device if needed */ > > #define OPTEE_SMC_SEC_CAP_RPMB_PROBE BIT(7) > > +/* Secure world supports protected memory */ > > +#define OPTEE_SMC_SEC_CAP_PROTMEM BIT(8) > > +/* Secure world supports dynamic protected memory */ > > +#define OPTEE_SMC_SEC_CAP_DYNAMIC_PROTMEM BIT(9) > > > > #define OPTEE_SMC_FUNCID_EXCHANGE_CAPABILITIES 9 > > #define OPTEE_SMC_EXCHANGE_CAPABILITIES \ > > @@ -451,6 +454,72 @@ struct optee_smc_disable_shm_cache_result { > > > > /* See OPTEE_SMC_CALL_WITH_REGD_ARG above */ > > #define OPTEE_SMC_FUNCID_CALL_WITH_REGD_ARG 19 > > +/* > > + * Get protected memory config > > + * > > + * Returns the protected memory config. > > + * > > + * Call register usage: > > + * a0 SMC Function ID, OPTEE_SMC_GET_PROTMEM_CONFIG > > + * a2-6 Not used, must be zero > > + * a7 Hypervisor Client ID register > > + * > > + * Have config return register usage: > > + * a0 OPTEE_SMC_RETURN_OK > > + * a1 Physical address of start of protected memory > > + * a2 Size of protected memory > > + * a3 Not used > > + * a4-7 Preserved > > + * > > + * Not available register usage: > > + * a0 OPTEE_SMC_RETURN_ENOTAVAIL > > + * a1-3 Not used > > + * a4-7 Preserved > > + */ > > +#define OPTEE_SMC_FUNCID_GET_PROTMEM_CONFIG 20 > > +#define OPTEE_SMC_GET_PROTMEM_CONFIG \ > > + OPTEE_SMC_FAST_CALL_VAL(OPTEE_SMC_FUNCID_GET_PROTMEM_CONFIG) > > + > > +struct optee_smc_get_protmem_config_result { > > + unsigned long status; > > + unsigned long start; > > + unsigned long size; > > + unsigned long flags; > > The ABI comment does not document a flags return argument, either > this can be removed or the ABI comment needs to be fixed. Sure, I'll remove the field. > Same for > > +}; > > + > > +/* > > + * Get dynamic protected memory config > > + * > > + * Returns the dynamic protected memory config. > > + * > > + * Call register usage: > > + * a0 SMC Function ID, OPTEE_SMC_GET_DYN_SHM_CONFIG > > should be OPTEE_SMC_GET_DYN_PROTMEM_CONFIG Thanks, I'll update. 
> > > + * a2-6 Not used, must be zero > > + * a7 Hypervisor Client ID register > > + * > > + * Have config return register usage: > > + * a0 OPTEE_SMC_RETURN_OK > > + * a1 Minamal size of protected memory > > Nit: Typo, should be "Minimal" Yes, I'll update. Cheers, Jens > > > + * a2 Required alignment of size and start of registered protected memory > > + * a3 Not used > > + * a4-7 Preserved > > + * > > + * Not available register usage: > > + * a0 OPTEE_SMC_RETURN_ENOTAVAIL > > + * a1-3 Not used > > + * a4-7 Preserved > > + */ > > + > > +#define OPTEE_SMC_FUNCID_GET_DYN_PROTMEM_CONFIG 21 > > +#define OPTEE_SMC_GET_DYN_PROTMEM_CONFIG \ > > + OPTEE_SMC_FAST_CALL_VAL(OPTEE_SMC_FUNCID_GET_DYN_PROTMEM_CONFIG) > > + > > +struct optee_smc_get_dyn_protmem_config_result { > > + unsigned long status; > > + unsigned long size; > > + unsigned long align; > > + unsigned long flags; > > +}; > > > > /* > > * Resume from RPC (for example after processing a foreign interrupt) > > -- > > 2.43.0 > > - Rouven

9 months, 3 weeks


[PATCH AUTOSEL 6.13 15/34] drm/amdgpu: allow pinning DMA-bufs into VRAM if all importers can do P2P

by Sasha Levin

From: Christian König <christian.koenig(a)amd.com> [ Upstream commit f5e7fabd1f5c65b2e077efcdb118cfa67eae7311 ] Try pinning into VRAM to allow P2P with RDMA NICs without ODP support if all attachments can do P2P. If any attachment can't do P2P just pin into GTT instead. Acked-by: Simona Vetter <simona.vetter(a)ffwll.ch> Signed-off-by: Christian König <christian.koenig(a)amd.com> Signed-off-by: Felix Kuehling <felix.kuehling(a)amd.com> Reviewed-by: Felix Kuehling <felix.kuehling(a)amd.com> Tested-by: Pak Nin Lui <pak.lui(a)amd.com> Cc: Simona Vetter <simona.vetter(a)ffwll.ch> Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com> Signed-off-by: Sasha Levin <sashal(a)kernel.org> --- drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 25 +++++++++++++++------ 1 file changed, 18 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c index 8e81a83d37d84..83390143c2e9f 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c @@ -72,11 +72,25 @@ static int amdgpu_dma_buf_attach(struct dma_buf *dmabuf, */ static int amdgpu_dma_buf_pin(struct dma_buf_attachment *attach) { - struct drm_gem_object *obj = attach->dmabuf->priv; - struct amdgpu_bo *bo = gem_to_amdgpu_bo(obj); + struct dma_buf *dmabuf = attach->dmabuf; + struct amdgpu_bo *bo = gem_to_amdgpu_bo(dmabuf->priv); + u32 domains = bo->preferred_domains; - /* pin buffer into GTT */ - return amdgpu_bo_pin(bo, AMDGPU_GEM_DOMAIN_GTT); + dma_resv_assert_held(dmabuf->resv); + + /* + * Try pinning into VRAM to allow P2P with RDMA NICs without ODP + * support if all attachments can do P2P. If any attachment can't do + * P2P just pin into GTT instead. 
+ */ + list_for_each_entry(attach, &dmabuf->attachments, node) + if (!attach->peer2peer) + domains &= ~AMDGPU_GEM_DOMAIN_VRAM; + + if (domains & AMDGPU_GEM_DOMAIN_VRAM) + bo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; + + return amdgpu_bo_pin(bo, domains); } /** @@ -131,9 +145,6 @@ static struct sg_table *amdgpu_dma_buf_map(struct dma_buf_attachment *attach, r = ttm_bo_validate(&bo->tbo, &bo->placement, &ctx); if (r) return ERR_PTR(r); - - } else if (bo->tbo.resource->mem_type != TTM_PL_TT) { - return ERR_PTR(-EBUSY); } switch (bo->tbo.resource->mem_type) { -- 2.39.5
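The domain selection in the patch above can be modelled outside the kernel. The following is a simplified sketch under stated assumptions, not amdgpu code: `sketch_attachment`, `pick_pin_domains()` and the `SKETCH_DOMAIN_*` bits are hypothetical stand-ins for `struct dma_buf_attachment`, the loop in `amdgpu_dma_buf_pin()` and the `AMDGPU_GEM_DOMAIN_*` flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical domain bits for this sketch; not the amdgpu uapi values. */
#define SKETCH_DOMAIN_GTT  (1u << 0)
#define SKETCH_DOMAIN_VRAM (1u << 1)

/* Minimal stand-in for a dma-buf attachment: only the P2P capability. */
struct sketch_attachment {
	bool peer2peer;
};

/*
 * Start from the buffer's preferred domains and drop VRAM as soon as
 * one importer cannot do peer-to-peer.
 */
static uint32_t pick_pin_domains(uint32_t preferred,
				 const struct sketch_attachment *atts,
				 size_t count)
{
	uint32_t domains = preferred;
	size_t i;

	assert(atts != NULL || count == 0);

	for (i = 0; i < count; i++)
		if (!atts[i].peer2peer)
			domains &= ~SKETCH_DOMAIN_VRAM;

	return domains;
}
```

The mask only ever removes VRAM, so a buffer whose preferred domains include GTT ends up pinned in GTT whenever a single importer lacks P2P support.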

9 months, 3 weeks


[PATCH v3 0/2] dma-buf: heaps: Support carved-out heaps

by Maxime Ripard

Hi, This series is the follow-up of the discussion that John and I had some time ago here: https://lore.kernel.org/all/CANDhNCquJn6bH3KxKf65BWiTYLVqSd9892-xtFDHHqqyrr… The initial problem we were discussing was that I'm currently working on a platform which has a memory layout with ECC enabled. However, enabling the ECC has a number of drawbacks on that platform: lower performance, increased memory usage, etc. So for things like framebuffers, the trade-off isn't great and thus there's a memory region with ECC disabled to allocate from for such use cases. After a suggestion from John, I chose to first start using heap allocations flags to allow for userspace to ask for a particular ECC setup. This is then backed by a new heap type that runs from reserved memory chunks flagged as such, and the existing DT properties to specify the ECC properties. After further discussion, it was considered that flags were not the right solution, and relying on the names of the heaps would be enough to let userspace know the kind of buffer it deals with. Thus, even though the uAPI part of it has been dropped in this second version, we still need a driver to create heaps out of carved-out memory regions. In addition to the original usecase, a similar driver can be found in BSPs from most vendors, so I believe it would be a useful addition to the kernel. 
I submitted a draft PR to the DT schema for the bindings used in this PR: https://github.com/devicetree-org/dt-schema/pull/138 Let me know what you think, Maxime Signed-off-by: Maxime Ripard <mripard(a)kernel.org> --- Changes in v3: - Reworked global variable patch - Link to v2: https://lore.kernel.org/r/20250401-dma-buf-ecc-heap-v2-0-043fd006a1af@kerne… Changes in v2: - Add vmap/vunmap operations - Drop ECC flags uapi - Rebase on top of 6.14 - Link to v1: https://lore.kernel.org/r/20240515-dma-buf-ecc-heap-v1-0-54cbbd049511@kerne… --- Maxime Ripard (2): dma-buf: heaps: system: Remove global variable dma-buf: heaps: Introduce a new heap for reserved memory drivers/dma-buf/heaps/Kconfig | 8 + drivers/dma-buf/heaps/Makefile | 1 + drivers/dma-buf/heaps/carveout_heap.c | 360 ++++++++++++++++++++++++++++++++++ drivers/dma-buf/heaps/system_heap.c | 3 +- 4 files changed, 370 insertions(+), 2 deletions(-) --- base-commit: fcbf30774e82a441890b722bf0c26542fb82150f change-id: 20240515-dma-buf-ecc-heap-28a311d2c94e Best regards, -- Maxime Ripard <mripard(a)kernel.org>

9 months, 3 weeks


Re: [PATCH 2/4] bpf: Add dmabuf iterator

by T.J. Mercier

On Tue, Apr 22, 2025 at 4:01 PM Alexei Starovoitov <alexei.starovoitov(a)gmail.com> wrote: > > On Tue, Apr 22, 2025 at 12:57 PM T.J. Mercier <tjmercier(a)google.com> wrote: > > > > On Mon, Apr 21, 2025 at 4:39 PM Alexei Starovoitov > > <alexei.starovoitov(a)gmail.com> wrote: > > > > > > On Mon, Apr 21, 2025 at 1:40 PM T.J. Mercier <tjmercier(a)google.com> wrote: > > > > > > > > > > new file mode 100644 > > > > > > index 000000000000..b4b8be1d6aa4 > > > > > > --- /dev/null > > > > > > +++ b/kernel/bpf/dmabuf_iter.c > > > > > > > > > > Maybe we should add this file to drivers/dma-buf. I would like to > > > > > hear other folks thoughts on this. > > > > > > > > This is fine with me, and would save us the extra > > > > CONFIG_DMA_SHARED_BUFFER check that's currently needed in > > > > kernel/bpf/Makefile but would require checking CONFIG_BPF instead. > > > > Sumit / Christian any objections to moving the dmabuf bpf iterator > > > > implementation into drivers/dma-buf? > > > > > > The driver directory would need to 'depends on BPF_SYSCALL'. > > > Are you sure you want this? > > > imo kernel/bpf/ is fine for this. > > > > I don't have a strong preference so either way is fine with me. The > > main difference I see is maintainership. > > > > > You also probably want > > > .feature = BPF_ITER_RESCHED > > > in bpf_dmabuf_reg_info. > > > > Thank you, this looks like a good idea. > > > > > Also have you considered open coded iterator for dmabufs? > > > Would it help with the interface to user space? > > > > I read through the open coded iterator patches, and it looks like they > > would be slightly more efficient by avoiding seq_file overhead. As far > > as the interface to userspace, for the purpose of replacing what's > > currently exposed by CONFIG_DMABUF_SYSFS_STATS I don't think there is > > a difference. 
However it looks like if I were to try to replace all of > > our userspace analysis of dmabufs with a single bpf program then an > > open coded iterator would make that much easier. I had not considered > > attempting that. > > > > One problem I see with open coded iterators is that support is much > > more recent (2023 vs 2020). We support longterm stable kernels (back > > to 5.4 currently but probably 5.10 by the time this would be used), so > > it seems like it would be harder to backport the kernel support for an > > open-coded iterator that far since it only goes back as far as 6.6 > > now. Actually it doesn't look like it is possible while also > > maintaining the stable ABI we provide to device vendors. Which means > > we couldn't get rid of the dmabuf sysfs stats userspace dependency > > until 6.1 EOL in Dec. 2027. :\ So I'm in favor of a traditional bpf > > iterator here for now. > > Fair enough, but please implement both and backport only > the old style pinned iterator. Ok, will do. > The code will be mostly shared between them. > bpf_iter_dmabuf_new/_next will be more flexible with more > options to return data to user space. Like android can invent > their own binary format. Pack into it in a bpf prog, send to > bpf ringbuf and unmarshal efficiently in user space. > Instead of being limited to text output that pinned iterators > are supposed to do usually. Also a neat idea! > You can do binary with bpf_seq_write() too, but it's rare. > > Also please provide full bpf prog that you'll use in production > in a selftest instead of trivial: > +SEC("iter/dmabuf") > +int dmabuf_collector(struct bpf_iter__dmabuf *ctx) > > just to make sure it's tested end to end and future changes > won't break it. The final bpf program should be something pretty close to that, but I'll start working on the AOSP side as well so I can put up patches. > > pw-bot: cr
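To make the "binary format" suggestion above concrete, here is a small user-space sketch of packing and unmarshalling a fixed-size record. The thread does not define any format: the `sketch_dmabuf_record` layout, its fields and both helpers are hypothetical. In a real setup the packing would happen in the BPF program and the unpacking in a ring buffer callback.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NAME_LEN 32

/* Hypothetical fixed-size record; the thread does not define any format. */
struct sketch_dmabuf_record {
	uint64_t inode;
	uint64_t size;
	char exporter[SKETCH_NAME_LEN];
};

/* Producer side: fill one record, zeroing it first so no stale bytes leak. */
static void sketch_pack(struct sketch_dmabuf_record *rec, uint64_t inode,
			uint64_t size, const char *exporter)
{
	memset(rec, 0, sizeof(*rec));
	rec->inode = inode;
	rec->size = size;
	strncpy(rec->exporter, exporter, SKETCH_NAME_LEN - 1);
}

/*
 * Consumer side: check the sample length, then copy the record out.
 * Returns 0 on success, -1 if the sample has an unexpected size.
 */
static int sketch_unpack(const void *data, size_t len,
			 struct sketch_dmabuf_record *out)
{
	if (len != sizeof(*out))
		return -1;

	memcpy(out, data, sizeof(*out));
	out->exporter[SKETCH_NAME_LEN - 1] = '\0';
	return 0;
}
```

A fixed-size record lets the consumer reject malformed samples with a single length check, which is harder to do with free-form text output.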

9 months, 3 weeks


Re: [PATCH v2 0/2] dma-buf: heaps: Use constant name for CMA heap

by Sumit Semwal

Hello Jared, On Wed, 23 Apr 2025 at 00:49, Jared Kangas <jkangas(a)redhat.com> wrote: > > Hi all, > > This patch series is based on a previous discussion around CMA heap > naming. [1] The heap's name depends on the device name, which is > generally "reserved", "linux,cma", or "default-pool", but could be any > arbitrary name given to the default CMA area in the devicetree. For a > consistent userspace interface, the series introduces a constant name > for the CMA heap, and for backwards compatibility, an additional Kconfig > that controls the creation of a legacy-named heap with the same CMA > backing. > > The ideas to handle backwards compatibility in [1] are to either use a > symlink or add a heap node with a duplicate minor. However, I assume > that we don't want to create symlinks in /dev from module initcalls, and > attempting to duplicate minors would cause device_create() to fail. > Because of these drawbacks, after brainstorming with Maxime Ripard, I > went with creating a new node in devtmpfs with its own minor. This > admittedly makes it a little unclear that the old and new nodes are > backed by the same heap when both are present. The only approach that I > think would provide total clarity on this in userspace is symlinking, > which seemed like a fairly involved solution for devtmpfs, but if I'm > wrong on this, please let me know. Thanks indeed for this patch; just one minor nit: the link referred to as [1] here seems to be missing. Could you please add it? This would make it easier to follow the chain of discussion in posterity. > > Changelog: > v2: Use tabs instead of spaces for large vertical alignment. 
> > Jared Kangas (2): > dma-buf: heaps: Parameterize heap name in __add_cma_heap() > dma-buf: heaps: Give default CMA heap a fixed name > > Documentation/userspace-api/dma-buf-heaps.rst | 11 ++++--- > drivers/dma-buf/heaps/Kconfig | 10 +++++++ > drivers/dma-buf/heaps/cma_heap.c | 30 ++++++++++++++----- > 3 files changed, 40 insertions(+), 11 deletions(-) > > -- > 2.49.0 > Best, Sumit

9 months, 3 weeks


Re: [PATCH v2 2/2] dma-buf: heaps: Give default CMA heap a fixed name

by John Stultz

On Tue, Apr 22, 2025 at 12:19 PM Jared Kangas <jkangas(a)redhat.com> wrote: > > The CMA heap's name in devtmpfs can vary depending on how the heap is > defined. Its name defaults to "reserved", but if a CMA area is defined > in the devicetree, the heap takes on the devicetree node's name, such as > "default-pool" or "linux,cma". To simplify naming, just name it > "default_cma", and keep a legacy node in place backed by the same > underlying structure for backwards compatibility. > > Signed-off-by: Jared Kangas <jkangas(a)redhat.com> Once again, thanks for working out how to improve the standard naming while keeping compatibility. I do still hope we can get to the point where other cma regions can be registered as heaps with unique/purpose-specific names, but I can see having a standard name for the default region is a nice improvement. Acked-by: John Stultz <jstultz(a)google.com> thanks -john
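For userspace that must run on kernels both with and without this series, one option is to probe for the fixed name first and fall back to the legacy names. The helper below is a sketch of that idea only. `pick_heap_path()` is a hypothetical function, and the probing order is an assumption rather than something the series prescribes.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Try each candidate name under `dir` in order and write the first path
 * that exists into `out`. Returns the index of the match, or -1 if none
 * of the candidates exists (in which case `out` is set to "").
 */
static int pick_heap_path(const char *dir, const char *const *names,
			  size_t count, char *out, size_t out_len)
{
	size_t i;

	assert(dir != NULL && names != NULL && out != NULL);

	for (i = 0; i < count; i++) {
		int n = snprintf(out, out_len, "%s/%s", dir, names[i]);

		/* Skip candidates whose path does not fit in the buffer. */
		if (n < 0 || (size_t)n >= out_len)
			continue;
		if (access(out, F_OK) == 0)
			return (int)i;
	}

	if (out_len > 0)
		out[0] = '\0';
	return -1;
}
```

A caller could pass `/dev/dma_heap` as the directory and list `default_cma` first, followed by the legacy names mentioned in the thread (`linux,cma`, `default-pool`, `reserved`).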

9 months, 3 weeks


Re: [PATCH v2 1/2] dma-buf: heaps: Parameterize heap name in __add_cma_heap()

by John Stultz

On Tue, Apr 22, 2025 at 12:19 PM Jared Kangas <jkangas(a)redhat.com> wrote: > > Prepare for the introduction of a fixed-name CMA heap by replacing the > unused void pointer parameter in __add_cma_heap() with the heap name. > > Signed-off-by: Jared Kangas <jkangas(a)redhat.com> Thanks so much for taking this effort on. Looks good to me! Acked-by: John Stultz <jstultz(a)google.com>

9 months, 3 weeks

