From: Yang Wang <kevinyang.wang@amd.com>
[ Upstream commit cf32515a70618c0fb2319bd4a855f4d9447940a8 ]
v1: Returns different error codes based on the scenario to help the user app understand the AMDGPU device status when an exception occurs.
v2: change -NODEV to -EBUSY.
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
- What changed
  - The function `amdgpu_pm_dev_state_check()` now returns `-EBUSY` instead of `-EPERM` when the device is in GPU reset or system suspend:
    - `drivers/gpu/drm/amd/pm/amdgpu_pm.c:112`: `if (amdgpu_in_reset(adev)) return -EBUSY;` (was `-EPERM`)
    - `drivers/gpu/drm/amd/pm/amdgpu_pm.c:115`: `if (adev->in_suspend && !runpm_check) return -EBUSY;` (was `-EPERM`)
  - This function gates access in `amdgpu_pm_get_access()` and `amdgpu_pm_get_access_if_active()`:
    - `drivers/gpu/drm/amd/pm/amdgpu_pm.c:133`: `ret = amdgpu_pm_dev_state_check(adev, true);`
    - `drivers/gpu/drm/amd/pm/amdgpu_pm.c:153`: `ret = amdgpu_pm_dev_state_check(adev, false);`
  - Numerous PM-related sysfs show/store handlers directly return the `ret` from these helpers (e.g., `amdgpu_get_power_dpm_state()` returns `ret` on failure), so the errno visible to userspace changes from `-EPERM` to `-EBUSY` when the device is resetting or suspended. The `amdgpu_get_power_dpm_state` path, from `drivers/gpu/drm/amd/pm/amdgpu_pm.c:217` onward, shows the pattern `ret = ...; if (ret) return ret;`.
- Why it’s a bug fix suitable for stable
  - Correctness/semantics: `-EPERM` indicates a permissions problem, which is misleading here; the device is temporarily unavailable due to reset or suspend. `-EBUSY` accurately communicates a transient busy state and invites retry, which aligns better with userspace expectations and error handling.
  - Scope and risk: The change is tiny and localized to return codes in a single helper. It does not alter call sequences, state checks, PM flows, or locking. No ABI or uAPI additions, no structural/architectural changes.
  - Impacted surface: Only the errno of sysfs PM nodes in specific exceptional states. In-kernel callers are not affected (the helpers are `static`). Userspace seeing `-EBUSY` instead of `-EPERM` is an improvement for diagnostics and retry logic. AMDGPU already returns `-EBUSY` in analogous busy conditions elsewhere, so this aligns with existing patterns.
  - Stability: No performance, functional, or security regression vectors are introduced. The remaining `-EPERM` usage in `amdgpu_pm_get_access_if_active()` when the device is not active (`drivers/gpu/drm/amd/pm/amdgpu_pm.c:163`) is untouched, keeping behavior consistent for that distinct case.
- Backport considerations
  - The patch is self-contained and minimal. If the target stable trees already have `amdgpu_pm_dev_state_check()` and the access helpers, this applies cleanly. If older trees did the checks inline in each sysfs op, the backport would require equivalent one-line replacements at those sites; still trivial and low risk.
  - No user-visible feature change; only corrected errno in exceptional conditions, which is a common and acceptable class of stable fixes.
Given the small, precise nature of the change, its correctness benefit, and minimal regression risk, this is a good candidate for stable backporting.
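
To illustrate the retry argument above, here is a minimal userspace sketch in C. It assumes that a read() on a PM sysfs attribute fails with errno set to EBUSY while the device is resetting or suspended, as the analysis describes. The helper name `read_attr_retry`, the retry count, and the delay are hypothetical and are not part of the patch or of any AMDGPU tooling.

```c
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a sysfs attribute into buf, retrying only
 * while the read fails with EBUSY (treated as a transient condition).
 * Any other errno, e.g. EPERM or ENOENT, is returned at once.
 * Returns the number of bytes read, or a negative errno value.
 */
static ssize_t read_attr_retry(const char *path, char *buf, size_t len,
			       int max_tries, long delay_ms)
{
	struct timespec delay = {
		.tv_sec = delay_ms / 1000,
		.tv_nsec = (delay_ms % 1000) * 1000000L,
	};

	if (len == 0 || max_tries <= 0)
		return -EINVAL;

	for (int attempt = 0; attempt < max_tries; attempt++) {
		int fd = open(path, O_RDONLY);

		if (fd < 0)
			return -errno;

		ssize_t n = read(fd, buf, len - 1);
		int err = errno;	/* save before close() can clobber it */

		close(fd);

		if (n >= 0) {
			buf[n] = '\0';
			return n;
		}
		if (err != EBUSY)
			return -err;	/* not transient: give up */

		nanosleep(&delay, NULL);	/* back off, then retry */
	}

	return -EBUSY;	/* still busy after max_tries attempts */
}
```

A caller might invoke it as `read_attr_retry("/sys/class/drm/card0/device/power_dpm_state", buf, sizeof(buf), 5, 100)`; the card index in that path is an assumption and varies by system. Under the old behavior the same condition surfaced as EPERM, which this kind of caller would reasonably treat as fatal rather than retry.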
 drivers/gpu/drm/amd/pm/amdgpu_pm.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/amdgpu_pm.c b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
index 5fbfe7333b54d..1fca183827c7c 100644
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c
+++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c
@@ -110,9 +110,10 @@ static int amdgpu_pm_dev_state_check(struct amdgpu_device *adev, bool runpm)
 	bool runpm_check = runpm ? adev->in_runpm : false;
 
 	if (amdgpu_in_reset(adev))
-		return -EPERM;
+		return -EBUSY;
+
 	if (adev->in_suspend && !runpm_check)
-		return -EPERM;
+		return -EBUSY;
 
 	return 0;
 }