It can cause a hang. This is normally not enabled for GPU hangs on these asics, but was recently enabled for handling aborted suspends. This causes hangs on some platforms on suspend.
Fixes: daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend (v2)") Cc: stable@vger.kernel.org Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1858 Signed-off-by: Alex Deucher alexander.deucher@amd.com --- drivers/gpu/drm/amd/amdgpu/cik.c | 4 ++++ drivers/gpu/drm/amd/amdgpu/vi.c | 4 ++++ 2 files changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c b/drivers/gpu/drm/amd/amdgpu/cik.c index 54f28c075f21..f10ce740a29c 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik.c +++ b/drivers/gpu/drm/amd/amdgpu/cik.c @@ -1428,6 +1428,10 @@ static int cik_asic_reset(struct amdgpu_device *adev) { int r;
+ /* APUs don't have full asic reset */ + if (adev->flags & AMD_IS_APU) + return 0; + if (cik_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) { dev_info(adev->dev, "BACO reset\n"); r = amdgpu_dpm_baco_reset(adev); diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c index fe9a7cc8d9eb..6645ebbd2696 100644 --- a/drivers/gpu/drm/amd/amdgpu/vi.c +++ b/drivers/gpu/drm/amd/amdgpu/vi.c @@ -956,6 +956,10 @@ static int vi_asic_reset(struct amdgpu_device *adev) { int r;
+ /* APUs don't have full asic reset */ + if (adev->flags & AMD_IS_APU) + return 0; + if (vi_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) { dev_info(adev->dev, "BACO reset\n"); r = amdgpu_dpm_baco_reset(adev);
Acked-by: Guchun Chen guchun.chen@amd.com
Regards, Guchun
-----Original Message----- From: amd-gfx amd-gfx-bounces@lists.freedesktop.org On Behalf Of Alex Deucher Sent: Thursday, January 13, 2022 12:01 PM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Alexander.Deucher@amd.com; stable@vger.kernel.org Subject: [PATCH] drm/amdgpu: don't do resets on APUs which don't support it
It can cause a hang. This is normally not enabled for GPU hangs on these asics, but was recently enabled for handling aborted suspends. This causes hangs on some platforms on suspend.
Fixes: daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend (v2)") Cc: stable@vger.kernel.org Bug: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.fre... Signed-off-by: Alex Deucher alexander.deucher@amd.com --- drivers/gpu/drm/amd/amdgpu/cik.c | 4 ++++ drivers/gpu/drm/amd/amdgpu/vi.c | 4 ++++ 2 files changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c b/drivers/gpu/drm/amd/amdgpu/cik.c index 54f28c075f21..f10ce740a29c 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik.c +++ b/drivers/gpu/drm/amd/amdgpu/cik.c @@ -1428,6 +1428,10 @@ static int cik_asic_reset(struct amdgpu_device *adev) { int r;
+ /* APUs don't have full asic reset */ + if (adev->flags & AMD_IS_APU) + return 0; + if (cik_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) { dev_info(adev->dev, "BACO reset\n"); r = amdgpu_dpm_baco_reset(adev); diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c index fe9a7cc8d9eb..6645ebbd2696 100644 --- a/drivers/gpu/drm/amd/amdgpu/vi.c +++ b/drivers/gpu/drm/amd/amdgpu/vi.c @@ -956,6 +956,10 @@ static int vi_asic_reset(struct amdgpu_device *adev) { int r;
+ /* APUs don't have full asic reset */ + if (adev->flags & AMD_IS_APU) + return 0; + if (vi_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) { dev_info(adev->dev, "BACO reset\n"); r = amdgpu_dpm_baco_reset(adev); -- 2.34.1
Hi Alex,
What about something like this?
bool amdgpu_device_reset_on_suspend(struct amdgpu_device *adev) { if (adev->in_s0ix || adev->gmc.xgmi.num_physical_nodes > 1) return false;
switch (amdgpu_asic_reset_method(adev)) { case AMD_RESET_METHOD_BACO: case AMD_RESET_METHOD_MODE1: case AMD_RESET_METHOD_MODE2: return true; }
return false; }
Thanks, Lijo
On 1/13/2022 9:31 AM, Alex Deucher wrote:
It can cause a hang. This is normally not enabled for GPU hangs on these asics, but was recently enabled for handling aborted suspends. This causes hangs on some platforms on suspend.
Fixes: daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend (v2)") Cc: stable@vger.kernel.org Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1858 Signed-off-by: Alex Deucher alexander.deucher@amd.com
drivers/gpu/drm/amd/amdgpu/cik.c | 4 ++++ drivers/gpu/drm/amd/amdgpu/vi.c | 4 ++++ 2 files changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c b/drivers/gpu/drm/amd/amdgpu/cik.c index 54f28c075f21..f10ce740a29c 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik.c +++ b/drivers/gpu/drm/amd/amdgpu/cik.c @@ -1428,6 +1428,10 @@ static int cik_asic_reset(struct amdgpu_device *adev) { int r;
- /* APUs don't have full asic reset */
- if (adev->flags & AMD_IS_APU)
return 0;
- if (cik_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) { dev_info(adev->dev, "BACO reset\n"); r = amdgpu_dpm_baco_reset(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c index fe9a7cc8d9eb..6645ebbd2696 100644 --- a/drivers/gpu/drm/amd/amdgpu/vi.c +++ b/drivers/gpu/drm/amd/amdgpu/vi.c @@ -956,6 +956,10 @@ static int vi_asic_reset(struct amdgpu_device *adev) { int r;
- /* APUs don't have full asic reset */
- if (adev->flags & AMD_IS_APU)
return 0;
- if (vi_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) { dev_info(adev->dev, "BACO reset\n"); r = amdgpu_dpm_baco_reset(adev);
On Thu, Jan 13, 2022 at 1:56 AM Lazar, Lijo lijo.lazar@amd.com wrote:
Hi Alex,
What about something like this?
bool amdgpu_device_reset_on_suspend(struct amdgpu_device *adev) { if (adev->in_s0ix || adev->gmc.xgmi.num_physical_nodes > 1) return false;
switch (amdgpu_asic_reset_method(adev)) { case AMD_RESET_METHOD_BACO: case AMD_RESET_METHOD_MODE1: case AMD_RESET_METHOD_MODE2:
This should also work on AMD_RESET_METHOD_LEGACY, at least for dGPUs. I think the current approach is probably better since I don't think GPU resets work reliably on these chips anyway (it's not enabled by default on them gor hangs), so better to just not do it as it may make the problem worse.
Alex
return true; } return false;
}
Thanks, Lijo
On 1/13/2022 9:31 AM, Alex Deucher wrote:
It can cause a hang. This is normally not enabled for GPU hangs on these asics, but was recently enabled for handling aborted suspends. This causes hangs on some platforms on suspend.
Fixes: daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend (v2)") Cc: stable@vger.kernel.org Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1858 Signed-off-by: Alex Deucher alexander.deucher@amd.com
drivers/gpu/drm/amd/amdgpu/cik.c | 4 ++++ drivers/gpu/drm/amd/amdgpu/vi.c | 4 ++++ 2 files changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c b/drivers/gpu/drm/amd/amdgpu/cik.c index 54f28c075f21..f10ce740a29c 100644 --- a/drivers/gpu/drm/amd/amdgpu/cik.c +++ b/drivers/gpu/drm/amd/amdgpu/cik.c @@ -1428,6 +1428,10 @@ static int cik_asic_reset(struct amdgpu_device *adev) { int r;
/* APUs don't have full asic reset */
if (adev->flags & AMD_IS_APU)
return 0;
if (cik_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) { dev_info(adev->dev, "BACO reset\n"); r = amdgpu_dpm_baco_reset(adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c index fe9a7cc8d9eb..6645ebbd2696 100644 --- a/drivers/gpu/drm/amd/amdgpu/vi.c +++ b/drivers/gpu/drm/amd/amdgpu/vi.c @@ -956,6 +956,10 @@ static int vi_asic_reset(struct amdgpu_device *adev) { int r;
/* APUs don't have full asic reset */
if (adev->flags & AMD_IS_APU)
return 0;
if (vi_asic_reset_method(adev) == AMD_RESET_METHOD_BACO) { dev_info(adev->dev, "BACO reset\n"); r = amdgpu_dpm_baco_reset(adev);
linux-stable-mirror@lists.linaro.org