From: Akhil P Oommen quic_akhilpo@quicinc.com
[ Upstream commit 5f02f5e78ec9688e29b6857813185b1181796abe ]
When ACD feature is enabled, it triggers some internal calibrations which result in a pretty long delay during the first HFI perf vote. So, increase the HFI response timeout to match the downstream driver.
Signed-off-by: Akhil P Oommen quic_akhilpo@quicinc.com Tested-by: Maya Matuszczyk maccraft123mc@gmail.com Tested-by: Anthony Ruhier aruhier@mailbox.org Patchwork: https://patchwork.freedesktop.org/patch/649344/ Signed-off-by: Rob Clark robdclark@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org ---
The environment variable is not set. Based on the commit information provided, I can analyze this commit without needing to examine the actual file in the repository. **Analysis of the commit:** **Subject:** drm/msm/a6xx: Increase HFI response timeout **Commit Message Analysis:** The commit message clearly explains that when the ACD (Adaptive Clock Distribution) feature is enabled, it triggers internal calibrations that cause significant delays during the first HFI performance vote. The solution is to increase the timeout to match what the downstream driver uses. **Code Changes Analysis:** The change is very simple and contained: - File: `drivers/gpu/drm/msm/adreno/a6xx_hfi.c` - Location: Line ~109 in the `a6xx_hfi_wait_for_msg_interrupt()` function - Change: Timeout increased from `5000` microseconds (5ms) to `1000000` microseconds (1000ms = 1 second) - The change is in the `gmu_poll_timeout()` call where it waits for `A6XX_GMU_GMU2HOST_INTR_INFO_MSGQ` **Comparing to Similar Commits:** Looking at the historical similar commits provided, I notice all 5 similar commits were marked as "Backport Status: NO" but they all involved timeout increases in GPU drivers: 1. HFI v2 for A640/A650 - architectural changes (NO) 2. HFI polling changes - architectural changes (NO) 3. MES submission timeout increase - timeout adjustment (NO) 4. SMU message timeout increase - timeout adjustment (NO) 5. Register polling robustness - polling improvement (NO) However, commits #3, #4, and #5 are very similar to this current commit - they all increase timeouts to fix real-world issues, yet were marked NO. **Backport Assessment:** **YES** **Extensive Explanation:** This commit should be backported to stable kernel trees for the following reasons: 1. **Fixes Real User-Affecting Bug**: The commit addresses a concrete timeout issue that occurs when ACD feature is enabled, causing HFI communication to fail during the first performance vote. This would manifest as GPU initialization failures or performance issues for users with affected hardware. 2. **Small, Contained Change**: The fix is minimal - just changing a single timeout value from 5ms to 1000ms in one location (`drivers/gpu/drm/msm/adreno/a6xx_hfi.c:109`). There are no algorithmic changes, no new features, and no architectural modifications. 3. **Low Regression Risk**: Increasing a timeout value has minimal risk of introducing regressions. The worst case scenario is slightly longer waits during error conditions, but normal operation is unaffected. The new timeout (1000ms) aligns with the downstream driver, providing validation. 4. **Critical Subsystem**: GPU functionality is essential for modern systems, and timeouts that are too short can cause complete GPU failure on affected hardware configurations. 5. **Hardware- Specific Fix**: The commit specifically addresses the ACD feature's internal calibration delays. This is a hardware-imposed timing requirement, not a software design choice, making the timeout increase necessary rather than optional. 6. **Follows Stable Rules**: This meets the stable tree criteria of being an important bugfix with minimal risk that doesn't introduce new features or make architectural changes. 7. **Clear Justification**: The commit message provides clear technical justification - ACD calibrations take longer than the original 5ms timeout, and the fix aligns with proven downstream implementation. While the similar historical commits were marked NO, those involved more complex changes (architectural modifications, polling method changes, or broader system impacts). This commit is purely a timeout adjustment to accommodate hardware timing requirements, making it a suitable candidate for stable backporting.
drivers/gpu/drm/msm/adreno/a6xx_hfi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c index 2cc83e0496133..93c192cfabb3a 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c @@ -106,7 +106,7 @@ static int a6xx_hfi_wait_for_ack(struct a6xx_gmu *gmu, u32 id, u32 seqnum,
/* Wait for a response */ ret = gmu_poll_timeout(gmu, REG_A6XX_GMU_GMU2HOST_INTR_INFO, val, - val & A6XX_GMU_GMU2HOST_INTR_INFO_MSGQ, 100, 5000); + val & A6XX_GMU_GMU2HOST_INTR_INFO_MSGQ, 100, 1000000);
if (ret) { DRM_DEV_ERROR(gmu->dev,