[PATCH AUTOSEL 6.17-5.15] drm/amd/pm: Use cached metrics data on aldebaran

25 Oct 2025

From: Lijo Lazar <lijo.lazar@amd.com>
[ Upstream commit e87577ef6daa0cfb10ca139c720f0c57bd894174 ]
Cached metrics data validity is 1ms on aldebaran. It's not reasonable
for any client to query gpu_metrics at a faster rate and constantly
interrupt PMFW.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
Rationale
- What changed: In
  `drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c:1717`, the last
  argument of `smu_cmn_get_metrics_table(smu, &metrics, true)` changes
  from `true` to `false`. This clears the `bypass_cache` flag so that
  Aldebaran’s `aldebaran_get_gpu_metrics()` uses the cached metrics
  instead of forcing a fresh PMFW query on every call.
- Cache semantics: `smu_cmn_get_metrics_table()` caches SMU metrics for
  1 ms and refreshes only if the cache is older or bypassed. See
  `drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c:1023` (1 ms validity),
  `...:1034` (updates and timestamps cache).
- Consistency with existing Aldebaran paths: Other Aldebaran helpers
  already use the cached path, e.g. `aldebaran_get_smu_metrics_data()`
  calls `smu_cmn_get_metrics_table(smu, NULL, false)` to reuse cached
  metrics (drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c:618). This
  change makes `get_gpu_metrics` consistent with those helpers.
- Why it matters: Forcing fresh metrics on every `gpu_metrics` read
  causes frequent SMU/PMFW interactions. On Aldebaran, cached metrics
  are valid for 1 ms (as the commit message notes). Using the cache
  avoids needless PMFW interrupts when clients poll faster than 1 kHz,
  improving firmware responsiveness and reducing overhead. The returned
  data can be at most 1 ms old, which is within the defined validity
  window.
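
To make the cache semantics above concrete, here is a minimal userspace
sketch of a 1 ms metrics cache with a bypass flag. It is a model under
stated assumptions, not the kernel implementation: every `demo_*` name,
both structs, and the explicit microsecond timestamps are hypothetical
stand-ins, and the real `smu_cmn_get_metrics_table()` may differ in
detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* 1 ms validity window, as stated in the commit message. */
#define DEMO_CACHE_VALID_US 1000u

/* Hypothetical stand-in for the firmware metrics table. */
struct demo_metrics {
	uint32_t temperature;
	uint32_t power;
};

/* Hypothetical cache state; not the kernel's struct smu_context. */
struct demo_cache {
	struct demo_metrics table;
	uint64_t last_fetch_us;  /* 0 means "never fetched" */
	unsigned int fw_queries; /* counts simulated firmware round trips */
};

/* Simulated firmware query: refills the table and counts the call. */
static void demo_query_firmware(struct demo_cache *c)
{
	c->fw_queries++;
	c->table.temperature = 40 + c->fw_queries;
	c->table.power = 100 + c->fw_queries;
}

/*
 * Return the metrics table, refreshing it only when the caller bypasses
 * the cache, the cache was never filled, or it is older than the window.
 */
static int demo_get_metrics(struct demo_cache *c, uint64_t now_us,
			    struct demo_metrics *out, bool bypass_cache)
{
	assert(c != NULL);

	if (bypass_cache || c->last_fetch_us == 0 ||
	    now_us - c->last_fetch_us > DEMO_CACHE_VALID_US) {
		demo_query_firmware(c);
		c->last_fetch_us = now_us;
	}
	if (out)
		memcpy(out, &c->table, sizeof(*out));
	return 0;
}

/* Poll n times, step_us apart; return how many firmware queries resulted. */
static unsigned int demo_poll(bool bypass_cache, unsigned int n,
			      uint64_t step_us)
{
	struct demo_cache c = {0};
	struct demo_metrics m;
	uint64_t now_us = 5000; /* arbitrary nonzero start time */

	for (unsigned int i = 0; i < n; i++, now_us += step_us)
		demo_get_metrics(&c, now_us, &m, bypass_cache);
	return c.fw_queries;
}
```

In this model, ten polls spaced 100 us apart cost ten firmware queries
with `bypass_cache` set and one without it, while a client polling every
2 ms sees a fresh query on each call either way.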
Risk and scope
- Minimal change, localized to Aldebaran: One boolean flip in an
  Aldebaran-specific function; no architectural or API changes; no
  cross-subsystem impact.
- Behavior impact is bounded: Only callers that poll more often than
  once per 1 ms are affected; they now receive cached values (up to
  1 ms old) instead of forcing a fresh read. This matches the
  established 1 ms cache policy in `smu_cmn_get_metrics_table`.
- Safe initialization: Metrics cache is initialized to 0 so the first
  fetch always refreshes
  (drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c:250).
- No security or correctness regressions: Reading telemetry that is up
  to 1 ms stale is expected and already done elsewhere, and it avoids
  the overhead of excessive PMFW interrupts.
Stable backport criteria
- Fixes a real-world issue (excessive PMFW interrupts / overhead under
  high-frequency polling) that can affect users.
- Small, contained change with low regression risk.
- No new features or ABI changes; aligns behavior with existing cache
  policy and other Aldebaran code paths.
- Touches a single driver component without architectural refactoring.
Given the narrow scope, clear benefit, and low risk, this is a good
candidate for stable backport.
drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
index c63d2e28954d0..b067147b7c41f 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c
@@ -1781,7 +1781,7 @@ static ssize_t aldebaran_get_gpu_metrics(struct smu_context *smu,
 	ret = smu_cmn_get_metrics_table(smu,
 					&metrics,
-					true);
+					false);
 	if (ret)
 		return ret;
-- 
2.51.0



    
