The drm/ttm patch modifies TTM to support multiple contexts for the pipelined moves.
Then amdgpu/ttm is updated to express dependencies between jobs explicitly, instead of relying on the ordering of execution guaranteed by the use of a single instance. With all of this in place, we can use multiple entities, each having access to the available SDMA instances.
This rework also gives the opportunity to merge the clear functions into a single one and to optimize GART usage a bit.
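The "multiple entities" idea boils down to round-robining job submission across the entities instead of funneling everything through a single one. A minimal userspace sketch of that selection scheme (the counter, entity count, and function name are illustrative stand-ins, not the actual amdgpu structures):

```c
#include <stdatomic.h>

/* Hypothetical model of round-robin entity selection. The real series
 * picks among per-device clear/move entities; here we just cycle an
 * index with an atomic counter so concurrent callers spread out. */
#define NUM_CLEAR_ENTITIES 4

static atomic_uint next_clear;

/* Returns the index of the entity to use for the next job. */
unsigned int next_entity_index(void)
{
	/* fetch-then-increment: callers get 0, 1, 2, 3, 0, ... */
	return atomic_fetch_add(&next_clear, 1) % NUM_CLEAR_ENTITIES;
}
```

Because each entity feeds its own scheduler, jobs picked this way can run on different SDMA instances in parallel, which is why the series must also express inter-job dependencies explicitly.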
Since v3, some patches have already been reviewed and merged separately:
- https://lists.freedesktop.org/archives/amd-gfx/2026-January/137747.html
- https://gitlab.freedesktop.org/drm/kernel/-/commit/ddf055b80a544d6f36f77be5f...
This version depends on them.
v3: https://lists.freedesktop.org/archives/dri-devel/2025-November/537830.html
Pierre-Eric Pelloux-Prayer (12):
  drm/amdgpu: allocate clear entities dynamically
  drm/amdgpu: allocate move entities dynamically
  drm/amdgpu: round robin through clear_entities in amdgpu_fill_buffer
  drm/amdgpu: use TTM_NUM_MOVE_FENCES when reserving fences
  drm/amdgpu: use multiple entities in amdgpu_move_blit
  drm/amdgpu: pass all the sdma scheds to amdgpu_mman
  drm/amdgpu: only use working sdma schedulers for ttm
  drm/amdgpu: create multiple clear/move ttm entities
  drm/amdgpu: give ttm entities access to all the sdma scheds
  drm/amdgpu: get rid of amdgpu_ttm_clear_buffer
  drm/amdgpu: rename amdgpu_fill_buffer as amdgpu_ttm_clear_buffer
  drm/amdgpu: split amdgpu_ttm_set_buffer_funcs_status in 2 funcs
 drivers/gpu/drm/amd/amdgpu/amdgpu.h           |   2 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c        |   5 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c    |  16 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c       |   4 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c    |  17 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c       | 329 ++++++++++--------
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h       |  29 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c      |   6 +-
 drivers/gpu/drm/amd/amdgpu/cik_sdma.c         |  13 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v2_4.c        |   8 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c        |   8 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c        |  15 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c      |  12 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c        |  11 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c        |  14 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v6_0.c        |   5 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c        |   5 +-
 drivers/gpu/drm/amd/amdgpu/sdma_v7_1.c        |  12 +-
 drivers/gpu/drm/amd/amdgpu/si_dma.c           |  12 +-
 drivers/gpu/drm/amd/amdkfd/kfd_migrate.c      |   5 +-
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c          |   3 +-
 .../amd/display/amdgpu_dm/amdgpu_dm_plane.c   |   6 +-
 .../drm/amd/display/amdgpu_dm/amdgpu_dm_wb.c  |   6 +-
 23 files changed, 300 insertions(+), 243 deletions(-)
It does the same thing as amdgpu_fill_buffer() with src_data=0, so drop it.
The only caveat is that amdgpu_res_cleared() return value is only valid right after allocation.
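The merged function therefore needs a flag to decide whether already-cleared regions may be skipped. A userspace model of that loop (the range struct and helper names are hypothetical, not the amdgpu API):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of walking a resource and skipping regions that
 * the allocator reports as already cleared (what amdgpu_res_cleared()
 * tells us right after allocation). */
struct range {
	size_t size;
	bool cleared; /* valid only right after allocation */
};

/* Bytes that still need a fill job. With consider_clear_status=false
 * (e.g. wipe-on-release), everything is filled regardless. */
size_t bytes_to_fill(const struct range *r, size_t n,
		     bool consider_clear_status)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++) {
		if (consider_clear_status && r[i].cleared)
			continue; /* freshly allocated, already zeroed */
		total += r[i].size;
	}
	return total;
}

/* Small demo resource: 16 KiB total, 8 KiB of it already cleared. */
static const struct range demo[] = {
	{ 4096, true }, { 8192, false }, { 4096, true },
};

size_t demo_bytes(bool consider_clear_status)
{
	return bytes_to_fill(demo, 3, consider_clear_status);
}
```

This is why the clear-on-create path can pass consider_clear_status=true while clear-on-release cannot: by release time the cleared status is stale.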
---
v2: introduce new "bool consider_clear_status" arg
---
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 16 ++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c    | 88 +++++-----------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h    |  6 +-
 3 files changed, 32 insertions(+), 78 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 66c20dd46d12..d0884bbffa75 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -717,13 +717,17 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
 	    bo->tbo.resource->mem_type == TTM_PL_VRAM) {
 		struct dma_fence *fence;
-		r = amdgpu_ttm_clear_buffer(bo, bo->tbo.base.resv, &fence);
+		r = amdgpu_fill_buffer(amdgpu_ttm_next_clear_entity(adev),
+				       bo, 0, NULL, &fence,
+				       true, AMDGPU_KERNEL_JOB_ID_TTM_CLEAR_BUFFER);
 		if (unlikely(r))
 			goto fail_unreserve;
-		dma_resv_add_fence(bo->tbo.base.resv, fence,
-				   DMA_RESV_USAGE_KERNEL);
-		dma_fence_put(fence);
+		if (fence) {
+			dma_resv_add_fence(bo->tbo.base.resv, fence,
+					   DMA_RESV_USAGE_KERNEL);
+			dma_fence_put(fence);
+		}
 	}
 	if (!bp->resv)
 		amdgpu_bo_unreserve(bo);
@@ -1326,8 +1330,8 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object *bo)
 		goto out;
 	r = amdgpu_fill_buffer(amdgpu_ttm_next_clear_entity(adev),
-			       abo, 0, &bo->base._resv,
-			       &fence, AMDGPU_KERNEL_JOB_ID_CLEAR_ON_RELEASE);
+			       abo, 0, &bo->base._resv, &fence,
+			       false, AMDGPU_KERNEL_JOB_ID_CLEAR_ON_RELEASE);
 	if (WARN_ON(r))
 		goto out;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index f4304f061d7e..b7124356dd26 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -418,7 +418,7 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
 	    (abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE)) {
 		struct dma_fence *wipe_fence = NULL;
 		r = amdgpu_fill_buffer(entity, abo, 0, NULL, &wipe_fence,
-				       AMDGPU_KERNEL_JOB_ID_MOVE_BLIT);
+				       false, AMDGPU_KERNEL_JOB_ID_MOVE_BLIT);
 		if (r) {
 			goto error;
 		} else if (wipe_fence) {
@@ -2582,76 +2582,25 @@ static int amdgpu_ttm_fill_mem(struct amdgpu_device *adev,
 }
 /**
- * amdgpu_ttm_clear_buffer - clear memory buffers
- * @bo: amdgpu buffer object
- * @resv: reservation object
- * @fence: dma_fence associated with the operation
+ * amdgpu_fill_buffer - fill a buffer with a given value
+ * @entity: entity to use
+ * @bo: the bo to fill
+ * @src_data: the value to set
+ * @resv: fences contained in this reservation will be used as dependencies.
+ * @out_fence: the fence from the last clear will be stored here. It might be
+ * NULL if no job was run.
+ * @dependency: optional input dependency fence.
+ * @consider_clear_status: true if region reported as cleared by amdgpu_res_cleared()
+ * are skipped.
+ * @k_job_id: trace id
  *
- * Clear the memory buffer resource.
- *
- * Returns:
- * 0 for success or a negative error code on failure.
  */
-int amdgpu_ttm_clear_buffer(struct amdgpu_bo *bo,
-			    struct dma_resv *resv,
-			    struct dma_fence **fence)
-{
-	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
-	struct amdgpu_ttm_buffer_entity *entity;
-	struct amdgpu_res_cursor cursor;
-	u64 addr;
-	int r = 0;
-
-	if (!adev->mman.buffer_funcs_enabled)
-		return -EINVAL;
-
-	if (!fence)
-		return -EINVAL;
-	entity = &adev->mman.clear_entities[0];
-	*fence = dma_fence_get_stub();
-
-	amdgpu_res_first(bo->tbo.resource, 0, amdgpu_bo_size(bo), &cursor);
-
-	mutex_lock(&entity->lock);
-	while (cursor.remaining) {
-		struct dma_fence *next = NULL;
-		u64 size;
-
-		if (amdgpu_res_cleared(&cursor)) {
-			amdgpu_res_next(&cursor, cursor.size);
-			continue;
-		}
-
-		/* Never clear more than 256MiB at once to avoid timeouts */
-		size = min(cursor.size, 256ULL << 20);
-
-		r = amdgpu_ttm_map_buffer(entity, &bo->tbo, bo->tbo.resource, &cursor,
-					  0, false, &size, &addr);
-		if (r)
-			goto err;
-
-		r = amdgpu_ttm_fill_mem(adev, entity, 0, addr, size, resv,
-					&next, true,
-					AMDGPU_KERNEL_JOB_ID_TTM_CLEAR_BUFFER);
-		if (r)
-			goto err;
-
-		dma_fence_put(*fence);
-		*fence = next;
-
-		amdgpu_res_next(&cursor, size);
-	}
-err:
-	mutex_unlock(&entity->lock);
-
-	return r;
-}
-
 int amdgpu_fill_buffer(struct amdgpu_ttm_buffer_entity *entity,
 		       struct amdgpu_bo *bo,
 		       uint32_t src_data,
 		       struct dma_resv *resv,
-		       struct dma_fence **f,
+		       struct dma_fence **out_fence,
+		       bool consider_clear_status,
 		       u64 k_job_id)
 {
 	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
@@ -2669,6 +2618,11 @@ int amdgpu_fill_buffer(struct amdgpu_ttm_buffer_entity *entity,
 		struct dma_fence *next;
 		uint64_t cur_size, to;
+		if (consider_clear_status && amdgpu_res_cleared(&dst)) {
+			amdgpu_res_next(&dst, dst.size);
+			continue;
+		}
+
 		/* Never fill more than 256MiB at once to avoid timeouts */
 		cur_size = min(dst.size, 256ULL << 20);
@@ -2690,9 +2644,7 @@ int amdgpu_fill_buffer(struct amdgpu_ttm_buffer_entity *entity,
 	}
 error:
 	mutex_unlock(&entity->lock);
-	if (f)
-		*f = dma_fence_get(fence);
-	dma_fence_put(fence);
+	*out_fence = fence;
 	return r;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index a6249252948b..436a3e09a178 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -187,14 +187,12 @@ int amdgpu_copy_buffer(struct amdgpu_device *adev,
 		       struct dma_resv *resv,
 		       struct dma_fence **fence, bool vm_needs_flush,
 		       uint32_t copy_flags);
-int amdgpu_ttm_clear_buffer(struct amdgpu_bo *bo,
-			    struct dma_resv *resv,
-			    struct dma_fence **fence);
 int amdgpu_fill_buffer(struct amdgpu_ttm_buffer_entity *entity,
 		       struct amdgpu_bo *bo,
 		       uint32_t src_data,
 		       struct dma_resv *resv,
-		       struct dma_fence **f,
+		       struct dma_fence **out_fence,
+		       bool consider_clear_status,
 		       u64 k_job_id);
 struct amdgpu_ttm_buffer_entity *amdgpu_ttm_next_clear_entity(struct amdgpu_device *adev);
Clearing is the only remaining use case for this function, so rename it accordingly.
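In userspace terms (illustrative names, not the kernel code), the rename just pins the fill value to zero and drops the src_data parameter from the entry point:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: the general fill primitive stays internal... */
static void fill_buffer(uint32_t *buf, size_t n, uint32_t src_data)
{
	for (size_t i = 0; i < n; i++)
		buf[i] = src_data;
}

/* ...and the public entry point is the zero-fill specialization,
 * mirroring amdgpu_fill_buffer -> amdgpu_ttm_clear_buffer. */
void clear_buffer(uint32_t *buf, size_t n)
{
	fill_buffer(buf, n, 0);
}

/* Demo: clear a small buffer and OR the words together. */
uint32_t demo_clear(void)
{
	uint32_t buf[4] = { 1, 2, 3, 4 };

	clear_buffer(buf, 4);
	return buf[0] | buf[1] | buf[2] | buf[3];
}
```

Dropping the unused parameter makes every caller's intent (clearing) explicit at the call site.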
---
v2: amdgpu_ttm_clear_buffer instead of amdgpu_clear_buffer
---
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 12 +++++------
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c    | 23 ++++++++++------------
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h    | 13 ++++++------
 3 files changed, 22 insertions(+), 26 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index d0884bbffa75..195cb1c814d1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -717,9 +717,9 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
 	    bo->tbo.resource->mem_type == TTM_PL_VRAM) {
 		struct dma_fence *fence;
-		r = amdgpu_fill_buffer(amdgpu_ttm_next_clear_entity(adev),
-				       bo, 0, NULL, &fence,
-				       true, AMDGPU_KERNEL_JOB_ID_TTM_CLEAR_BUFFER);
+		r = amdgpu_ttm_clear_buffer(amdgpu_ttm_next_clear_entity(adev),
+					    bo, NULL, &fence,
+					    true, AMDGPU_KERNEL_JOB_ID_TTM_CLEAR_BUFFER);
 		if (unlikely(r))
 			goto fail_unreserve;
@@ -1329,9 +1329,9 @@ void amdgpu_bo_release_notify(struct ttm_buffer_object *bo)
 	if (r)
 		goto out;
-	r = amdgpu_fill_buffer(amdgpu_ttm_next_clear_entity(adev),
-			       abo, 0, &bo->base._resv, &fence,
-			       false, AMDGPU_KERNEL_JOB_ID_CLEAR_ON_RELEASE);
+	r = amdgpu_ttm_clear_buffer(amdgpu_ttm_next_clear_entity(adev),
+				    abo, &bo->base._resv, &fence,
+				    false, AMDGPU_KERNEL_JOB_ID_CLEAR_ON_RELEASE);
 	if (WARN_ON(r))
 		goto out;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index b7124356dd26..3b369b3fbce8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -417,8 +417,8 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
 	if (old_mem->mem_type == TTM_PL_VRAM &&
 	    (abo->flags & AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE)) {
 		struct dma_fence *wipe_fence = NULL;
-		r = amdgpu_fill_buffer(entity, abo, 0, NULL, &wipe_fence,
-				       false, AMDGPU_KERNEL_JOB_ID_MOVE_BLIT);
+		r = amdgpu_ttm_clear_buffer(entity, abo, NULL, &wipe_fence,
+					    false, AMDGPU_KERNEL_JOB_ID_MOVE_BLIT);
 		if (r) {
 			goto error;
 		} else if (wipe_fence) {
@@ -2582,26 +2582,23 @@ static int amdgpu_ttm_fill_mem(struct amdgpu_device *adev,
 }
 /**
- * amdgpu_fill_buffer - fill a buffer with a given value
+ * amdgpu_ttm_clear_buffer - fill a buffer with 0
  * @entity: entity to use
  * @bo: the bo to fill
- * @src_data: the value to set
  * @resv: fences contained in this reservation will be used as dependencies.
  * @out_fence: the fence from the last clear will be stored here. It might be
  * NULL if no job was run.
- * @dependency: optional input dependency fence.
  * @consider_clear_status: true if region reported as cleared by amdgpu_res_cleared()
  * are skipped.
  * @k_job_id: trace id
  *
  */
-int amdgpu_fill_buffer(struct amdgpu_ttm_buffer_entity *entity,
-		       struct amdgpu_bo *bo,
-		       uint32_t src_data,
-		       struct dma_resv *resv,
-		       struct dma_fence **out_fence,
-		       bool consider_clear_status,
-		       u64 k_job_id)
+int amdgpu_ttm_clear_buffer(struct amdgpu_ttm_buffer_entity *entity,
+			    struct amdgpu_bo *bo,
+			    struct dma_resv *resv,
+			    struct dma_fence **out_fence,
+			    bool consider_clear_status,
+			    u64 k_job_id)
 {
 	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
 	struct dma_fence *fence = NULL;
@@ -2632,7 +2629,7 @@ int amdgpu_fill_buffer(struct amdgpu_ttm_buffer_entity *entity,
 		goto error;
 		r = amdgpu_ttm_fill_mem(adev, entity,
-					src_data, to, cur_size, resv,
+					0, to, cur_size, resv,
 					&next, true, k_job_id);
 		if (r)
 			goto error;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index 436a3e09a178..d7b14d5cac77 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -187,13 +187,12 @@ int amdgpu_copy_buffer(struct amdgpu_device *adev,
 		       struct dma_resv *resv,
 		       struct dma_fence **fence, bool vm_needs_flush,
 		       uint32_t copy_flags);
-int amdgpu_fill_buffer(struct amdgpu_ttm_buffer_entity *entity,
-		       struct amdgpu_bo *bo,
-		       uint32_t src_data,
-		       struct dma_resv *resv,
-		       struct dma_fence **out_fence,
-		       bool consider_clear_status,
-		       u64 k_job_id);
+int amdgpu_ttm_clear_buffer(struct amdgpu_ttm_buffer_entity *entity,
+			    struct amdgpu_bo *bo,
+			    struct dma_resv *resv,
+			    struct dma_fence **out_fence,
+			    bool consider_clear_status,
+			    u64 k_job_id);
 struct amdgpu_ttm_buffer_entity *amdgpu_ttm_next_clear_entity(struct amdgpu_device *adev);
int amdgpu_ttm_alloc_gart(struct ttm_buffer_object *bo);
Finally coming back to this patch set here.
Feel free to add Reviewed-by: Christian König <christian.koenig@amd.com> to the first two patches as well, and then please start pushing the patches to amd-staging-drm-next.
I probably need to go over the last patches once more, but I think it would be better to have the first few upstream first.
Regards, Christian.
On 2/3/26 11:22, Pierre-Eric Pelloux-Prayer wrote: