From: Stuart Summers <stuart.summers@intel.com>
[ Upstream commit 76186a253a4b9eb41c5a83224c14efdf30960a71 ]
Add a new _fini() routine on the GT TLB invalidation side to handle this worker cleanup on driver teardown.
v2: Move the TLB teardown to the gt fini() routine called during gt_init rather than in gt_alloc. This way the GT structure stays alive while we reset the TLB state.
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250826182911.392550-3-stuart.summers@intel.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
- What it fixes
  - Prevents use-after-free/hangs on driver teardown by cancelling pending TLB-invalidation workers/fences before GT resources are dismantled. The reset path already handles this during GT resets; this commit ensures the same cleanup occurs on teardown.
- Key changes and why they matter
  - drivers/gpu/drm/xe/xe_gt.c: `xe_gt_fini()` now calls `xe_gt_tlb_invalidation_fini(gt)` first. This ensures TLB invalidation workers/fences are cancelled while the GT is still alive, avoiding races/UAF during teardown.
  - drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c: Adds `xe_gt_tlb_invalidation_fini(struct xe_gt *gt)` which simply calls `xe_gt_tlb_invalidation_reset(gt)`. The reset routine:
    - Computes a “pending” seqno and updates `seqno_recv` so waiters see all prior invalidations as complete.
    - Iterates `pending_fences` and signals them, waking any kworkers waiting for TLB flush completion.
    - This mirrors the existing reset behavior (cancel delayed work, advance seqno, signal fences) used during GT resets to guarantee no waiter is left behind.
  - drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h: Adds the prototype for the new fini, keeping the API consistent.
- Concrete evidence in the code changes
  - The commit places `xe_gt_tlb_invalidation_fini(gt)` at the start of GT teardown (xe_gt.c: in `xe_gt_fini()`), so TLB/worker cleanup runs before `xe_hw_fence_irq_finish()` and `xe_gt_disable_host_l2_vram()`. This ordering minimizes races with IRQ/fence infrastructure and other GT resources during teardown.
  - The finish routine calls into the reset path which explicitly:
    - Sets `seqno_recv` to a value covering all outstanding requests.
    - Signals all pending invalidation fences via `list_for_each_entry_safe(... pending_fences ...)`, ensuring waiters are released.
  - This matches the comment in the reset path about kworkers not tracked by explicit TLB fences and the need to wake them assuming a full GT reset.
- Mapping to current tree (for context/impact assessment)
  - In this tree, the corresponding logic lives under the “tlb_inval” names:
    - The reset path is implemented in `drivers/gpu/drm/xe/xe_tlb_inval.c:156` (`xe_tlb_inval_reset()`), which cancels the delayed timeout work, updates `seqno_recv`, and signals all `pending_fences`.
    - This path is already invoked during GT reset flows (e.g., `drivers/gpu/drm/xe/xe_gt.c:853, 1067, 1139`), proving the approach is safe and battle-tested during runtime resets.
  - A drmm-managed teardown hook exists (`drivers/gpu/drm/xe/xe_tlb_inval.c:114`), but that operates at DRM device teardown. If GT devm teardown runs earlier, there is a window where TLB invalidation workers could outlive GT, risking UAF. Moving the cleanup into `xe_gt_fini()` (devm action, see `drivers/gpu/drm/xe/xe_gt.c:624`) closes that gap, which is exactly what this commit does in its codebase.
- Stable backport criteria
  - Important bugfix: avoids teardown-time UAF/hangs/leaks by cancelling and signalling all pending TLB invalidation work.
  - Small and contained: touches only the xe GT/TLB invalidation teardown path; adds one call-site and a thin wrapper.
  - No feature or architectural change: purely lifecycle/cleanup ordering.
  - Low regression risk: uses the same reset logic already exercised in GT reset paths.
  - Driver subsystem only (DRM xe), not core kernel.
- Conclusion
  - This is a clear, low-risk correctness fix for teardown-time resource and worker cleanup in the xe driver. It should be backported to stable trees where the xe driver and TLB invalidation workers exist, adapting symbol/file names as needed (e.g., calling `xe_tlb_inval_reset(&gt->tlb_inval)` from `xe_gt_fini()` in trees that use the `tlb_inval` naming).
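
The reset-on-teardown behaviour described in the notes above can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the driver code: every `sketch_*` name, the singly linked pending list, and the `SKETCH_SEQNO_MAX` wrap point are hypothetical, and the locking and delayed-work cancellation done by the real reset path are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed wrap point for sequence numbers; hypothetical value. */
#define SKETCH_SEQNO_MAX 0x100000

/* Hypothetical stand-in for one pending TLB invalidation fence. */
struct sketch_fence {
	int seqno;
	bool signaled;
	struct sketch_fence *next;	/* next entry on the pending list */
};

/* Hypothetical stand-in for the per-GT TLB invalidation state. */
struct sketch_tlb_state {
	int seqno;			/* next seqno to hand out */
	int seqno_recv;			/* newest seqno treated as complete */
	struct sketch_fence *pending;	/* fences still waiting */
};

static void sketch_tlb_init(struct sketch_tlb_state *s)
{
	s->seqno = 1;
	s->seqno_recv = 0;
	s->pending = NULL;
}

/* Queue a fence for a new invalidation and advance the seqno. */
static void sketch_tlb_add_fence(struct sketch_tlb_state *s,
				 struct sketch_fence *f)
{
	assert(s && f);
	f->seqno = s->seqno;
	f->signaled = false;
	f->next = s->pending;
	s->pending = f;
	s->seqno = (s->seqno + 1) % SKETCH_SEQNO_MAX;
	if (!s->seqno)
		s->seqno = 1;	/* 0 is never handed out in this sketch */
}

/*
 * Mark every outstanding invalidation as complete and signal all
 * pending fences. Returns the number of fences signaled.
 */
static int sketch_tlb_reset(struct sketch_tlb_state *s)
{
	int signaled = 0;

	/* The last seqno handed out covers all outstanding requests. */
	s->seqno_recv = (s->seqno == 1) ? SKETCH_SEQNO_MAX - 1 : s->seqno - 1;

	while (s->pending) {
		struct sketch_fence *f = s->pending;

		s->pending = f->next;	/* unlink before signaling */
		f->next = NULL;
		f->signaled = true;
		signaled++;
	}
	return signaled;
}

/* Teardown reuses the reset path, as the commit does with a thin wrapper. */
static int sketch_tlb_fini(struct sketch_tlb_state *s)
{
	return sketch_tlb_reset(s);
}
```

In this model a teardown routine would call `sketch_tlb_fini()` before releasing anything the fences or their waiters might still touch, which is the ordering the commit establishes in `xe_gt_fini()`.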
 drivers/gpu/drm/xe/xe_gt.c                  |  2 ++
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c | 12 ++++++++++++
 drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h |  1 +
 3 files changed, 15 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index 17634195cdc26..6f63c658c341f 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -605,6 +605,8 @@ static void xe_gt_fini(void *arg)
 	struct xe_gt *gt = arg;
 	int i;
 
+	xe_gt_tlb_invalidation_fini(gt);
+
 	for (i = 0; i < XE_ENGINE_CLASS_MAX; ++i)
 		xe_hw_fence_irq_finish(&gt->fence_irq[i]);
 
diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
index 086c12ee3d9de..64cd6cf0ab8df 100644
--- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
+++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
@@ -173,6 +173,18 @@ void xe_gt_tlb_invalidation_reset(struct xe_gt *gt)
 	mutex_unlock(&gt->uc.guc.ct.lock);
 }
 
+/**
+ *
+ * xe_gt_tlb_invalidation_fini - Clean up GT TLB invalidation state
+ *
+ * Cancel pending fence workers and clean up any additional
+ * GT TLB invalidation state.
+ */
+void xe_gt_tlb_invalidation_fini(struct xe_gt *gt)
+{
+	xe_gt_tlb_invalidation_reset(gt);
+}
+
 static bool tlb_invalidation_seqno_past(struct xe_gt *gt, int seqno)
 {
 	int seqno_recv = READ_ONCE(gt->tlb_invalidation.seqno_recv);
diff --git a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h
index f7f0f2eaf4b59..3e4cff3922d6f 100644
--- a/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h
+++ b/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.h
@@ -16,6 +16,7 @@ struct xe_vm;
 struct xe_vma;
 
 int xe_gt_tlb_invalidation_init_early(struct xe_gt *gt);
+void xe_gt_tlb_invalidation_fini(struct xe_gt *gt);
 
 void xe_gt_tlb_invalidation_reset(struct xe_gt *gt);
 int xe_gt_tlb_invalidation_ggtt(struct xe_gt *gt);