From: Alex Deucher <alexander.deucher@amd.com>
[ Upstream commit ec813f384b1a9df332e86ff46c422e5d2d00217f ]
We need to cancel any outstanding work at both suspend and driver teardown. Move the cancel to hw_fini which gets called in both cases.
Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
- What it fixes: Prevents a race/use-after-free by ensuring the VPE idle delayed work is canceled during both suspend and full driver teardown. The idle work is initialized and used for power-gating idling (drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c:355), is scheduled during normal operation (drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c:873), and can reschedule itself (drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c:336). Without a cancel on teardown, it may run after the ring has been stopped or while/after resources are being freed (see the first sketch after this list).
- Change details (the resulting functions are reconstructed just before the diff below):
  - Added cancellation to the teardown path: `cancel_delayed_work_sync(&adev->vpe.idle_work);` in `vpe_hw_fini()`, so it runs on both suspend and teardown (drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c:439).
  - Simplified the suspend path to delegate to `vpe_hw_fini()` (drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c:449–451), ensuring the same cancel happens on suspend.
  - The IP block ops confirm that suspend and hw_fini are the hooks used by both flows (drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c:978–981).
- Why it matters: The idle work handler toggles power gating (drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c:333–336). If it remains queued while the ring is being stopped (drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c:441) or while memory is being torn down in `sw_fini()`, it can touch freed objects or hardware in an invalid state. Moving the cancel to `hw_fini()` guarantees the work is canceled before the ring is stopped and the block is power-gated, and before the rest of teardown.
- Scope and risk:
  - Small, contained change in a single driver file; no API or architectural change.
  - Uses the standard `cancel_delayed_work_sync()`, which waits for a running handler to finish and prevents it from requeueing.
  - Consistent with existing practice elsewhere in amdgpu; e.g., the idle work is already canceled when the ring is about to be used (drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c:849; see the second sketch after this list).
  - Low regression risk; only affects the teardown/suspend sequencing of a driver-local delayed work.
- Stable backport fit:
  - Fixes a real race that can cause crashes during suspend or driver unload (a user-visible reliability issue).
  - Minimal and targeted; no new features.
  - Touches a non-core subsystem (amdgpu VPE), keeping risk bounded.
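First sketch: to make the race concrete, here is a minimal, self-contained illustration (a trivial kernel module, not the amdgpu code; all names are hypothetical) of the pattern described above: a delayed work item that can reschedule itself, and a teardown path that must use `cancel_delayed_work_sync()` so the handler can neither still be running nor requeue itself once teardown proceeds.

```c
/* Minimal sketch, not the driver's code: a self-rescheduling delayed
 * work and the synchronous cancel that teardown needs.
 */
#include <linux/module.h>
#include <linux/workqueue.h>
#include <linux/jiffies.h>

static struct delayed_work idle_work;
static bool block_busy;		/* stand-in for "ring still has fences" */

static void idle_work_handler(struct work_struct *work)
{
	if (block_busy) {
		/* Still busy: requeue ourselves, as the VPE idle work can.
		 * This is why the cancel must be synchronous and must come
		 * before the ring/hardware is torn down.
		 */
		schedule_delayed_work(&idle_work, msecs_to_jiffies(1000));
		return;
	}
	pr_info("idle: power-gate the block here\n");
}

static int __init idle_demo_init(void)
{
	INIT_DELAYED_WORK(&idle_work, idle_work_handler);
	schedule_delayed_work(&idle_work, msecs_to_jiffies(1000));
	return 0;
}

static void __exit idle_demo_exit(void)
{
	/* Equivalent of the hunk added to vpe_hw_fini(): wait for a running
	 * handler to finish and forbid requeueing before anything the
	 * handler touches goes away.
	 */
	cancel_delayed_work_sync(&idle_work);
}

module_init(idle_demo_init);
module_exit(idle_demo_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("delayed-work teardown sketch");
```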
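Second sketch: the "existing practice" point refers to the ring usage hooks, where the idle work is canceled when a job is about to run and re-armed once the ring goes quiet. The following is a hedged approximation of that pattern, not the driver's exact code: includes are omitted, the real hooks also handle power-gating state, and the driver uses its own timeout constant.

```c
/* Hedged approximation of the ring begin_use/end_use idle-work pattern. */
static void vpe_ring_begin_use(struct amdgpu_ring *ring)
{
	struct amdgpu_device *adev = ring->adev;

	/* A job is about to run: the idle handler must not power-gate
	 * the block underneath it (cf. amdgpu_vpe.c:849).
	 */
	cancel_delayed_work_sync(&adev->vpe.idle_work);
}

static void vpe_ring_end_use(struct amdgpu_ring *ring)
{
	struct amdgpu_device *adev = ring->adev;

	/* Ring went quiet: re-arm the idle timer (the 1 s value here is
	 * illustrative only).
	 */
	schedule_delayed_work(&adev->vpe.idle_work, msecs_to_jiffies(1000));
}
```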
Conclusion: This is a clear, low-risk bug fix that aligns with stable rules and should be backported.
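For quick reference, here is how the two functions read after this patch, reconstructed from the hunks below; the unchanged tail of `vpe_hw_fini()` (which powers off the block) is elided and represented only by a comment and its return.

```c
static int vpe_hw_fini(struct amdgpu_ip_block *ip_block)
{
	struct amdgpu_device *adev = ip_block->adev;
	struct amdgpu_vpe *vpe = &adev->vpe;

	/* New: make sure the idle work can no longer run or requeue. */
	cancel_delayed_work_sync(&adev->vpe.idle_work);

	vpe_ring_stop(vpe);

	/* Power off VPE ... (rest of the function unchanged, elided) */
	return 0;
}

static int vpe_suspend(struct amdgpu_ip_block *ip_block)
{
	/* The cancel now happens inside vpe_hw_fini(), so suspend just
	 * delegates to it.
	 */
	return vpe_hw_fini(ip_block);
}
```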
 drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c
index 121ee17b522bd..dcdb2654ceb4a 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vpe.c
@@ -435,6 +435,8 @@ static int vpe_hw_fini(struct amdgpu_ip_block *ip_block)
 	struct amdgpu_device *adev = ip_block->adev;
 	struct amdgpu_vpe *vpe = &adev->vpe;
 
+	cancel_delayed_work_sync(&adev->vpe.idle_work);
+
 	vpe_ring_stop(vpe);
 
 	/* Power off VPE */
@@ -445,10 +447,6 @@ static int vpe_hw_fini(struct amdgpu_ip_block *ip_block)
 
 static int vpe_suspend(struct amdgpu_ip_block *ip_block)
 {
-	struct amdgpu_device *adev = ip_block->adev;
-
-	cancel_delayed_work_sync(&adev->vpe.idle_work);
-
 	return vpe_hw_fini(ip_block);
 }
 