From: Aurabindo Pillai aurabindo.pillai@amd.com
[ Upstream commit c210b757b400959577a5a17b783b5959b82baed8 ]
Accessing DC from amdgpu_dm is usually preceded by acquisition of dc_lock mutex. Most of the DC API that DM calls are under a DC lock. However, there are a few that are not. Some DC API called from interrupt context end up sending DMUB commands via a DC API, while other threads were using DMUB. This was apparent from a race between calls for setting idle optimization enable/disable and the DC API to set vmin/vmax.
Offload the call to dc_stream_adjust_vmin_vmax() to a thread instead of directly calling them from the interrupt handler such that it waits for dc_lock.
Reviewed-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Signed-off-by: Aurabindo Pillai aurabindo.pillai@amd.com Signed-off-by: Roman Li roman.li@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org ---
LLM Generated explanations, may be completely bogus:
YES
- What it fixes - Addresses a real race where DCN/DM code issues DMUB commands from IRQ context without the normal `dc_lock`, while other threads concurrently use DMUB. The commit message cites a concrete race between idle optimization enable/disable and vmin/vmax updates. This is user-visible (VRR/BTR timing adjustments) and can cause instability or incorrect behavior.
- Where the race is today - `dc_stream_adjust_vmin_vmax()` is invoked directly from high IRQ handlers without taking `dc_lock`: - `drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:593` inside `dm_vupdate_high_irq()` (pre-AI BTR path) calls `dc_stream_adjust_vmin_vmax(...)` while holding `event_lock`. - `drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:686` inside `dm_crtc_high_irq()` (AI and newer path) does the same. - Many other DC/DM paths guard DMUB access with `adev->dm.dc_lock`, so these IRQ paths are outliers that can race.
- What the patch changes - Adds a small offload mechanism to move the vmin/vmax update out of IRQ context and under `dc_lock`: - New work handler: `dm_handle_vmin_vmax_update()` acquires `adev->dm.dc_lock`, calls `dc_stream_adjust_vmin_vmax(adev->dm.dc, stream, adjust)`, then releases the lock and cleans up. - New helper: `schedule_dc_vmin_vmax(adev, stream, adjust)` retains the stream, copies the adjust struct, initializes a work item, and `queue_work(system_wq, ...)`. - Adds `struct vupdate_offload_work` in `drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h` to carry `adev`, `stream`, and `adjust` through the workqueue. - Replaces the direct IRQ-time calls to `dc_stream_adjust_vmin_vmax()` with `schedule_dc_vmin_vmax(...)` in both IRQ paths: - `dm_vupdate_high_irq()` patch hunk replaces the direct call (was at `drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:593`) with `schedule_dc_vmin_vmax(...)`. - `dm_crtc_high_irq()` patch hunk replaces the direct call (was at `drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c:686`) with `schedule_dc_vmin_vmax(...)`.
- Why this is a good stable candidate - Important bug fix: Prevents concurrent DMUB accesses, which are known to cause issues in VRR/BTR updates and idle optimizations. - Small and contained: Only touches AMD display (`amdgpu_dm.c`, `amdgpu_dm.h`), adds a ~localized work item and defers a single API call. - Matches established patterns: DM frequently defers DMUB operations when it may need `dc_lock` (e.g., ROI/CRC path defers work and explicitly notes it may need to wait for `dc_lock`, see `drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_crc.c:867` and related schedule_work usage). - No uAPI or architectural changes: Purely internal synchronization/dispatch change to align with DC/DM locking rules. - Reviewed and tested: Has Reviewed-by/Tested-by/Signed-off-by from AMD maintainers, increasing confidence.
- Risk and considerations - Timing sensitivity: Moving `dc_stream_adjust_vmin_vmax()` out of the IRQ to a worker may slightly delay DRR/VRR timing updates. AMD already uses deferred work for DMUB operations elsewhere and the patch protects the call with `dc_lock`, so the tradeoff is correctness over micro-timing. This should be safe and consistent with the rest of DC/DM locking. - Allocation in IRQ context: The provided diff shows `schedule_dc_vmin_vmax()` uses `kzalloc(..., GFP_KERNEL)` while it is invoked from high IRQ context and while holding `event_lock`. In IRQ context, `GFP_KERNEL` can sleep and is not permitted; at minimum `GFP_ATOMIC` is required, or the allocation should be moved out of the locked IRQ region. Stable backport should include any follow-up fixing the GFP flags or move the call site outside the spinlock. The design (deferral + `dc_lock`) is right, but the allocation flags must be IRQ-safe. - Stream lifetime: The patch correctly uses `dc_stream_retain()` in the IRQ path and `dc_stream_release()` in the worker, preventing use-after-free.
- Stable tree fit - Fixes a real race affecting users (VRR/FreeSync/BTR correctness). - Minimal, localized change to the AMD display driver; no API or broad subsystem changes. - Aligns with stable rules for important bugfixes with low regression risk. - Recommendation: Backport together with (or adjusted to include) IRQ- safe memory allocation (e.g., use `GFP_ATOMIC` for the offload work and adjust-copy allocations, or allocate outside the IRQ spinlock) to avoid introducing a new IRQ-sleep regression.
In sum, this is a targeted concurrency fix for DMUB access and a strong candidate for stable, with the caveat to ensure IRQ-safe allocations in the offload path when backporting.
.../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 55 +++++++++++++++++-- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h | 14 +++++ 2 files changed, 63 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 163780030eb16..aca57cc815514 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -541,6 +541,50 @@ static void dm_pflip_high_irq(void *interrupt_params) amdgpu_crtc->crtc_id, amdgpu_crtc, vrr_active, (int)!e); }
+static void dm_handle_vmin_vmax_update(struct work_struct *offload_work) +{ + struct vupdate_offload_work *work = container_of(offload_work, struct vupdate_offload_work, work); + struct amdgpu_device *adev = work->adev; + struct dc_stream_state *stream = work->stream; + struct dc_crtc_timing_adjust *adjust = work->adjust; + + mutex_lock(&adev->dm.dc_lock); + dc_stream_adjust_vmin_vmax(adev->dm.dc, stream, adjust); + mutex_unlock(&adev->dm.dc_lock); + + dc_stream_release(stream); + kfree(work->adjust); + kfree(work); +} + +static void schedule_dc_vmin_vmax(struct amdgpu_device *adev, + struct dc_stream_state *stream, + struct dc_crtc_timing_adjust *adjust) +{ + struct vupdate_offload_work *offload_work = kzalloc(sizeof(*offload_work), GFP_KERNEL); + if (!offload_work) { + drm_dbg_driver(adev_to_drm(adev), "Failed to allocate vupdate_offload_work\n"); + return; + } + + struct dc_crtc_timing_adjust *adjust_copy = kzalloc(sizeof(*adjust_copy), GFP_KERNEL); + if (!adjust_copy) { + drm_dbg_driver(adev_to_drm(adev), "Failed to allocate adjust_copy\n"); + kfree(offload_work); + return; + } + + dc_stream_retain(stream); + memcpy(adjust_copy, adjust, sizeof(*adjust_copy)); + + INIT_WORK(&offload_work->work, dm_handle_vmin_vmax_update); + offload_work->adev = adev; + offload_work->stream = stream; + offload_work->adjust = adjust_copy; + + queue_work(system_wq, &offload_work->work); +} + static void dm_vupdate_high_irq(void *interrupt_params) { struct common_irq_params *irq_params = interrupt_params; @@ -590,10 +634,9 @@ static void dm_vupdate_high_irq(void *interrupt_params) acrtc->dm_irq_params.stream, &acrtc->dm_irq_params.vrr_params);
- dc_stream_adjust_vmin_vmax( - adev->dm.dc, - acrtc->dm_irq_params.stream, - &acrtc->dm_irq_params.vrr_params.adjust); + schedule_dc_vmin_vmax(adev, + acrtc->dm_irq_params.stream, + &acrtc->dm_irq_params.vrr_params.adjust); spin_unlock_irqrestore(&adev_to_drm(adev)->event_lock, flags); } } @@ -683,8 +726,8 @@ static void dm_crtc_high_irq(void *interrupt_params) acrtc->dm_irq_params.stream, &acrtc->dm_irq_params.vrr_params);
- dc_stream_adjust_vmin_vmax(adev->dm.dc, acrtc->dm_irq_params.stream, - &acrtc->dm_irq_params.vrr_params.adjust); + schedule_dc_vmin_vmax(adev, acrtc->dm_irq_params.stream, + &acrtc->dm_irq_params.vrr_params.adjust); }
/* diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h index b937da0a4e4a0..c18a6b43c76f6 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h @@ -152,6 +152,20 @@ struct idle_workqueue { bool running; };
+/** + * struct dm_vupdate_work - Work data for periodic action in idle + * @work: Kernel work data for the work event + * @adev: amdgpu_device back pointer + * @stream: DC stream associated with the crtc + * @adjust: DC CRTC timing adjust to be applied to the crtc + */ +struct vupdate_offload_work { + struct work_struct work; + struct amdgpu_device *adev; + struct dc_stream_state *stream; + struct dc_crtc_timing_adjust *adjust; +}; + #define MAX_LUMINANCE_DATA_POINTS 99
/**