The AMDGPU_GEM_OP_GET_MAPPING_INFO branch of amdgpu_gem_op_ioctl() holds three cleanup-tracked resources before calling kvcalloc(): the drm_gem_object reference from drm_gem_object_lookup(), the drm_exec lock on the looked-up GEM via drm_exec_lock_obj(), and the drm_exec lock on the per-process VM root page directory via amdgpu_vm_lock_pd(). All three are released by the out_exec label that every other error path in this function jumps to. The kvcalloc() failure path returns -ENOMEM directly, skipping out_exec and leaking all three.
The leaked per-process VM root PD dma_resv lock is the load-bearing leak: any subsequent operation on the same VM (further GEM ops, command-submission, eviction, TTM shrinker callbacks) blocks on the held lock. DRM_IOCTL_AMDGPU_GEM_OP is DRM_AUTH | DRM_RENDER_ALLOW, so this is an unprivileged-local denial of service against the caller's GPU context, reachable by any process with /dev/dri/renderD* access.
Route the failure through out_exec so drm_exec_fini() and drm_gem_object_put() run.
Reproduced on stock 7.0.0-10, Ryzen 7 5700U / Radeon Vega (Lucienne): the failing ioctl returns -ENOMEM and a second GET_MAPPING_INFO on the same fd then blocks in drm_exec_lock_obj() on the leaked dma_resv. SIGKILL on the caller does not reap the task; the fd-release path during process exit goes through amdgpu_gem_object_close() -> drm_exec_prepare_obj() on the same lock, leaving the task in D state until the box is rebooted. The patched kernel was not rebuilt and re-tested on this hardware; the fix is mechanical. Tested on a single Lucienne / Vega box only.
Ziyi Guo posted an independent INT_MAX-bound check for args->num_entries in the same branch [1]; the two patches are complementary and can land in either order.
Fixes: 4d82724f7f2b ("drm/amdgpu: Add mapping info option for GEM_OP ioctl") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20260208000255.4073363-1-n7l8m4@u.northwestern.e... # [1] Signed-off-by: Michael Bommarito michael.bommarito@gmail.com Assisted-by: Claude:claude-opus-4-7 --- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index 9ef80bca4102..8224fb499fdf 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -1091,8 +1091,10 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev, void *data, * be retried. */ vm_entries = kvcalloc(args->num_entries, sizeof(*vm_entries), GFP_KERNEL); - if (!vm_entries) - return -ENOMEM; + if (!vm_entries) { + r = -ENOMEM; + goto out_exec; + }
amdgpu_vm_bo_va_for_each_valid_mapping(bo_va, mapping) { if (num_mappings < args->num_entries) {
Applied. Thanks!
Alex
On Sun, May 17, 2026 at 9:24 AM Michael Bommarito michael.bommarito@gmail.com wrote:
The AMDGPU_GEM_OP_GET_MAPPING_INFO branch of amdgpu_gem_op_ioctl() holds three cleanup-tracked resources before calling kvcalloc(): the drm_gem_object reference from drm_gem_object_lookup(), the drm_exec lock on the looked-up GEM via drm_exec_lock_obj(), and the drm_exec lock on the per-process VM root page directory via amdgpu_vm_lock_pd(). All three are released by the out_exec label that every other error path in this function jumps to. The kvcalloc() failure path returns -ENOMEM directly, skipping out_exec and leaking all three.
The leaked per-process VM root PD dma_resv lock is the load-bearing leak: any subsequent operation on the same VM (further GEM ops, command-submission, eviction, TTM shrinker callbacks) blocks on the held lock. DRM_IOCTL_AMDGPU_GEM_OP is DRM_AUTH | DRM_RENDER_ALLOW, so this is an unprivileged-local denial of service against the caller's GPU context, reachable by any process with /dev/dri/renderD* access.
Route the failure through out_exec so drm_exec_fini() and drm_gem_object_put() run.
Reproduced on stock 7.0.0-10, Ryzen 7 5700U / Radeon Vega (Lucienne): the failing ioctl returns -ENOMEM and a second GET_MAPPING_INFO on the same fd then blocks in drm_exec_lock_obj() on the leaked dma_resv. SIGKILL on the caller does not reap the task; the fd-release path during process exit goes through amdgpu_gem_object_close() -> drm_exec_prepare_obj() on the same lock, leaving the task in D state until the box is rebooted. The patched kernel was not rebuilt and re-tested on this hardware; the fix is mechanical. Tested on a single Lucienne / Vega box only.
Ziyi Guo posted an independent INT_MAX-bound check for args->num_entries in the same branch [1]; the two patches are complementary and can land in either order.
Fixes: 4d82724f7f2b ("drm/amdgpu: Add mapping info option for GEM_OP ioctl") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20260208000255.4073363-1-n7l8m4@u.northwestern.e... # [1] Signed-off-by: Michael Bommarito michael.bommarito@gmail.com Assisted-by: Claude:claude-opus-4-7
drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index 9ef80bca4102..8224fb499fdf 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -1091,8 +1091,10 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev, void *data, * be retried. */ vm_entries = kvcalloc(args->num_entries, sizeof(*vm_entries), GFP_KERNEL);
if (!vm_entries)return -ENOMEM;
if (!vm_entries) {r = -ENOMEM;goto out_exec;} amdgpu_vm_bo_va_for_each_valid_mapping(bo_va, mapping) { if (num_mappings < args->num_entries) {-- 2.53.0
linaro-mm-sig@lists.linaro.org