From: Akhil P Oommen <akhilpo@oss.qualcomm.com>
[ Upstream commit a27d774045566b587bfc1ae9fb122642b06677b8 ]
There are some special registers which are accessible even when the GX power domain is collapsed during an IFPC sleep. Accessing these registers wakes the GPU from power collapse and allows programming them without an additional handshake with the GMU. This patch adds support for this special register write sequence.
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/673368/
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
- What it fixes
  - Prevents register writes from being “dropped” while the GPU’s GX power domain is collapsed during IFPC. The new fenced write path forces a retry and waits for the GMU AHB fence to move to allow mode so the write actually sticks.
  - Without this, key writes (preemption trigger, ring wptr restore) can be lost, leading to missed preemption, scheduling stalls, or timeouts under IFPC.
- Scope and changes
  - Introduces a contained helper that performs a write, issues a heavy barrier, and polls the GMU AHB fence status, retrying briefly if the write was dropped:
    - `drivers/gpu/drm/msm/adreno/a6xx_gpu.c:32` adds `fence_status_check()`, which retries after a writedropped indication and enforces `mb()` barriers.
    - `drivers/gpu/drm/msm/adreno/a6xx_gpu.c:50` adds `fenced_write()`, which polls `REG_A6XX_GMU_AHB_FENCE_STATUS` via `gmu_poll_timeout()` for up to 2ms and logs rate-limited errors on delay/fail.
    - `drivers/gpu/drm/msm/adreno/a6xx_gpu.c:94` adds `a6xx_fenced_write()`, which handles 32-bit and 64-bit register pairs.
  - Converts critical writes to this fenced path (the common call-site pattern is sketched after this list):
    - Ring flush path: `drivers/gpu/drm/msm/adreno/a6xx_gpu.c:180` switches `REG_A6XX_CP_RB_WPTR` to a fenced write (mask `BIT(0)`).
    - Preemption wptr restore: `drivers/gpu/drm/msm/adreno/a6xx_preempt.c:51` uses a fenced write for `REG_A6XX_CP_RB_WPTR`.
    - Preemption trigger and context setup:
      - `drivers/gpu/drm/msm/adreno/a6xx_preempt.c:319` uses a fenced 64b write for `REG_A6XX_CP_CONTEXT_SWITCH_SMMU_INFO` (mask `BIT(1)`).
      - `drivers/gpu/drm/msm/adreno/a6xx_preempt.c:319` uses a fenced 64b write for `REG_A6XX_CP_CONTEXT_SWITCH_PRIV_NON_SECURE_RESTORE_ADDR` (mask `BIT(1)`).
      - `drivers/gpu/drm/msm/adreno/a6xx_preempt.c:350` uses a fenced write for `REG_A6XX_CP_CONTEXT_SWITCH_CNTL` (mask `BIT(1)`).
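  The converted call sites all follow one calling convention. A condensed sketch (the fragments are lifted from the diff below; mapping `BIT(0)`/`BIT(1)` to the writedropped0/writedropped1 status bits is an inference from the comment in `fence_status_check()`):

      /* 32-bit fenced write: poll the writedropped0 status bit (BIT(0)) */
      a6xx_fenced_write(a6xx_gpu, REG_A6XX_CP_RB_WPTR, wptr, BIT(0), false);

      /* 64-bit fenced write: the lower dword goes to 'offset', the upper dword
       * to 'offset + 1', both polled against the writedropped1 bit (BIT(1))
       */
      a6xx_fenced_write(a6xx_gpu, REG_A6XX_CP_CONTEXT_SWITCH_SMMU_INFO,
                        a6xx_gpu->preempt_smmu_iova[ring->id], BIT(1), true);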
- Why this is safe and appropriate for stable
  - Small and surgical: changes are isolated to the msm/adreno a6xx driver and only replace a few direct writes with a robust, bounded poll-and-retry sequence.
  - Minimal-risk path selection:
    - On platforms without a “real” GMU (GMU wrapper), the new helper fast-paths out and behaves like the previous code (`drivers/gpu/drm/msm/adreno/adreno_gpu.h:274` and `drivers/gpu/drm/msm/adreno/a6xx_gpu.c:60`).
    - When not in IFPC (or when the fence is already in allow mode), the condition evaluates immediately and returns without delay (`drivers/gpu/drm/msm/adreno/a6xx_gmu.h:169` for `gmu_poll_timeout`; see the sketch after this list).
  - Time-bounded: two 1ms polls with short udelays; errors are rate-limited, and the call sites do not introduce new failure paths relative to pre-existing behavior.
  - Aligned with existing GMU AHB fence machinery already in-tree:
    - Fence ranges are configured during GMU bring-up (`drivers/gpu/drm/msm/adreno/a6xx_gmu.c:897` writes `REG_A6XX_GMU_AHB_FENCE_RANGE_0` to cover the CP context switch region; register locations: `drivers/gpu/drm/msm/registers/adreno/a6xx.xml:167` `CP_CONTEXT_SWITCH_CNTL`, `drivers/gpu/drm/msm/registers/adreno/a6xx.xml:173` `CP_CONTEXT_SWITCH_SMMU_INFO`, `drivers/gpu/drm/msm/registers/adreno/a6xx.xml:174` `CP_CONTEXT_SWITCH_PRIV_NON_SECURE_RESTORE_ADDR`, `drivers/gpu/drm/msm/registers/adreno/adreno_common.xml:129` `CP_RB_WPTR`).
    - WRITEDROPPED status fields are cleared elsewhere as part of power/control management (`drivers/gpu/drm/msm/adreno/a6xx_gmu.c:1044`), and the helper polls the same `REG_A6XX_GMU_AHB_FENCE_STATUS`.
  - Fixes real-world IFPC races and intermittent failures: IFPC has the GMU put the AHB fence into drop mode during collapses; writes to certain CP registers are “special” and intended to wake the GPU and allow programming without extra handshakes, but they need the fence status polling to be reliable. This patch makes those writes robust against brief GMU/IFPC races.
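  For context on the time bound above: both poll windows rely on `gmu_poll_timeout()` re-evaluating its condition with a freshly read status value on every iteration. A minimal sketch of the assumed shape of that macro (paraphrased from `a6xx_gmu.h`; verify the exact definition in the target tree rather than relying on this):

      /* Assumed shape: a thin wrapper around readl_poll_timeout() over the GMU
       * MMIO window, so 'cond' is re-checked against a freshly read 'val' until
       * it becomes true or 'timeout_us' expires.
       */
      #define gmu_poll_timeout(gmu, addr, val, cond, sleep_us, timeout_us) \
              readl_poll_timeout((gmu)->mmio + ((addr) << 2), val, cond, \
                                 sleep_us, timeout_us)

  Because the condition is an arbitrary expression, `fence_status_check()` can carry side effects: each time it still sees the writedropped bit set, it re-issues the register write and waits 10us, which is what turns the two 1000us poll windows into a continuous ~2ms retry loop.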
- Stable policy considerations
  - Bugfix vs. feature: this is a correctness/reliability fix under IFPC, not a new feature. It eliminates spurious failures where preemption and ring write-pointer updates are dropped due to IFPC-related fencing. No ABI/API changes.
  - Limited blast radius: touches only a6xx GPU driver paths; does not alter core DRM or other subsystems.
  - No architectural overhaul: it adds a helper and swaps a few register writes; the GMU fence infrastructure it depends on is already present in the driver.
- Potential backport prerequisites
  - Ensure the target stable tree includes the GMU AHB fence control/range support used by the helper (e.g., `REG_A6XX_GMU_AHB_FENCE_STATUS`, `REG_A6XX_GMU_AHB_FENCE_RANGE_0`, `REG_A6XX_GMU_AO_AHB_FENCE_CTRL`) and `gmu_poll_timeout()` (`drivers/gpu/drm/msm/adreno/a6xx_gmu.h:169`).
  - If IFPC is not enabled for specific GPUs in a given stable tree, this change is effectively a no-op on those platforms and remains safe.
Conclusion: This is a contained, low-risk reliability fix for IFPC-capable Adreno a6xx/a7xx paths, preventing dropped writes to critical CP registers during power-collapsed states. It meets stable backport criteria and should be backported.
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c     | 80 ++++++++++++++++++++++-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h     |  1 +
 drivers/gpu/drm/msm/adreno/a6xx_preempt.c | 20 +++---
 3 files changed, 90 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index f8992a68df7fb..536da1acf615e 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -16,6 +16,84 @@
 #define GPU_PAS_ID 13
+static bool fence_status_check(struct msm_gpu *gpu, u32 offset, u32 value, u32 status, u32 mask)
+{
+        /* Success if !writedropped0/1 */
+        if (!(status & mask))
+                return true;
+
+        udelay(10);
+
+        /* Try to update fenced register again */
+        gpu_write(gpu, offset, value);
+
+        /* We can't do a posted write here because the power domain could be
+         * in collapse state. So use the heaviest barrier instead
+         */
+        mb();
+        return false;
+}
+
+static int fenced_write(struct a6xx_gpu *a6xx_gpu, u32 offset, u32 value, u32 mask)
+{
+        struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
+        struct msm_gpu *gpu = &adreno_gpu->base;
+        struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
+        u32 status;
+
+        gpu_write(gpu, offset, value);
+
+        /* Nothing else to be done in the case of no-GMU */
+        if (adreno_has_gmu_wrapper(adreno_gpu))
+                return 0;
+
+        /* We can't do a posted write here because the power domain could be
+         * in collapse state. So use the heaviest barrier instead
+         */
+        mb();
+
+        if (!gmu_poll_timeout(gmu, REG_A6XX_GMU_AHB_FENCE_STATUS, status,
+                        fence_status_check(gpu, offset, value, status, mask), 0, 1000))
+                return 0;
+
+        /* Try again for another 1ms before failing */
+        gpu_write(gpu, offset, value);
+        mb();
+
+        if (!gmu_poll_timeout(gmu, REG_A6XX_GMU_AHB_FENCE_STATUS, status,
+                        fence_status_check(gpu, offset, value, status, mask), 0, 1000)) {
+                /*
+                 * The 'delay' warning is here because the pause to print this
+                 * warning will allow gpu to move to power collapse which
+                 * defeats the purpose of continuous polling for 2 ms
+                 */
+                dev_err_ratelimited(gmu->dev, "delay in fenced register write (0x%x)\n",
+                                offset);
+                return 0;
+        }
+
+        dev_err_ratelimited(gmu->dev, "fenced register write (0x%x) fail\n",
+                        offset);
+
+        return -ETIMEDOUT;
+}
+
+int a6xx_fenced_write(struct a6xx_gpu *a6xx_gpu, u32 offset, u64 value, u32 mask, bool is_64b)
+{
+        int ret;
+
+        ret = fenced_write(a6xx_gpu, offset, lower_32_bits(value), mask);
+        if (ret)
+                return ret;
+
+        if (!is_64b)
+                return 0;
+
+        ret = fenced_write(a6xx_gpu, offset + 1, upper_32_bits(value), mask);
+
+        return ret;
+}
+
 static inline bool _a6xx_check_idle(struct msm_gpu *gpu)
 {
         struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
@@ -86,7 +164,7 @@ static void a6xx_flush(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
         /* Update HW if this is the current ring and we are not in preempt*/
         if (!a6xx_in_preempt(a6xx_gpu)) {
                 if (a6xx_gpu->cur_ring == ring)
-                        gpu_write(gpu, REG_A6XX_CP_RB_WPTR, wptr);
+                        a6xx_fenced_write(a6xx_gpu, REG_A6XX_CP_RB_WPTR, wptr, BIT(0), false);
                 else
                         ring->restore_wptr = true;
         } else {
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
index 6e71f617fc3d0..e736c59d566b3 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.h
@@ -295,5 +295,6 @@ int a6xx_gpu_state_put(struct msm_gpu_state *state);
 void a6xx_bus_clear_pending_transactions(struct adreno_gpu *adreno_gpu, bool gx_off);
 void a6xx_gpu_sw_reset(struct msm_gpu *gpu, bool assert);
+int a6xx_fenced_write(struct a6xx_gpu *gpu, u32 offset, u64 value, u32 mask, bool is_64b);
 #endif /* __A6XX_GPU_H__ */
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_preempt.c b/drivers/gpu/drm/msm/adreno/a6xx_preempt.c
index 6a12a35dabff1..10625ffbc4cfc 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_preempt.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_preempt.c
@@ -41,7 +41,7 @@ static inline void set_preempt_state(struct a6xx_gpu *gpu,
 }
 /* Write the most recent wptr for the given ring into the hardware */
-static inline void update_wptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
+static inline void update_wptr(struct a6xx_gpu *a6xx_gpu, struct msm_ringbuffer *ring)
 {
         unsigned long flags;
         uint32_t wptr;
@@ -51,7 +51,7 @@ static inline void update_wptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
         if (ring->restore_wptr) {
                 wptr = get_wptr(ring);
-                gpu_write(gpu, REG_A6XX_CP_RB_WPTR, wptr);
+                a6xx_fenced_write(a6xx_gpu, REG_A6XX_CP_RB_WPTR, wptr, BIT(0), false);
                 ring->restore_wptr = false;
         }
@@ -172,7 +172,7 @@ void a6xx_preempt_irq(struct msm_gpu *gpu)
         set_preempt_state(a6xx_gpu, PREEMPT_FINISH);
-        update_wptr(gpu, a6xx_gpu->cur_ring);
+        update_wptr(a6xx_gpu, a6xx_gpu->cur_ring);
         set_preempt_state(a6xx_gpu, PREEMPT_NONE);
@@ -268,7 +268,7 @@ void a6xx_preempt_trigger(struct msm_gpu *gpu)
          */
         if (!ring || (a6xx_gpu->cur_ring == ring)) {
                 set_preempt_state(a6xx_gpu, PREEMPT_FINISH);
-                update_wptr(gpu, a6xx_gpu->cur_ring);
+                update_wptr(a6xx_gpu, a6xx_gpu->cur_ring);
                 set_preempt_state(a6xx_gpu, PREEMPT_NONE);
                 spin_unlock_irqrestore(&a6xx_gpu->eval_lock, flags);
                 return;
@@ -302,13 +302,13 @@ void a6xx_preempt_trigger(struct msm_gpu *gpu)
         spin_unlock_irqrestore(&ring->preempt_lock, flags);
-        gpu_write64(gpu,
-                REG_A6XX_CP_CONTEXT_SWITCH_SMMU_INFO,
-                a6xx_gpu->preempt_smmu_iova[ring->id]);
+        a6xx_fenced_write(a6xx_gpu,
+                REG_A6XX_CP_CONTEXT_SWITCH_SMMU_INFO, a6xx_gpu->preempt_smmu_iova[ring->id],
+                BIT(1), true);
-        gpu_write64(gpu,
+        a6xx_fenced_write(a6xx_gpu,
                 REG_A6XX_CP_CONTEXT_SWITCH_PRIV_NON_SECURE_RESTORE_ADDR,
-                a6xx_gpu->preempt_iova[ring->id]);
+                a6xx_gpu->preempt_iova[ring->id], BIT(1), true);
         a6xx_gpu->next_ring = ring;
@@ -328,7 +328,7 @@ void a6xx_preempt_trigger(struct msm_gpu *gpu)
         set_preempt_state(a6xx_gpu, PREEMPT_TRIGGERED);
         /* Trigger the preemption */
-        gpu_write(gpu, REG_A6XX_CP_CONTEXT_SWITCH_CNTL, cntl);
+        a6xx_fenced_write(a6xx_gpu, REG_A6XX_CP_CONTEXT_SWITCH_CNTL, cntl, BIT(1), false);
 }
 static int preempt_init_ring(struct a6xx_gpu *a6xx_gpu,