This patch fixes a VRAM BO eviction failure during resume, seen when playing the Steam game Cuphead.
During PSP resume, the driver requests a 10240 KiB VRAM buffer for the trusted memory region (TMR). As part of this allocation we try to evict a few user buffers from the VRAM domain to the SYSTEM domain. The eviction fails because the selected resource does not have contiguous blocks, so the TMR request fails and the system gets stuck in the resume process.
This change skips any resource that has non-contiguous blocks and moves on to the next available resource until it finds one with contiguous blocks. That resource is moved from VRAM to SYSTEM, the TMR allocation in VRAM then succeeds, and the system completes the resume.
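The skip-and-continue behaviour described above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct model_bo`, `MEM_VRAM`, `FLAG_CONTIGUOUS`, `eviction_valuable()` and `pick_eviction_candidate()` are hypothetical stand-ins for the TTM structures and the LRU walk, not the driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the TTM memory types and placement flag. */
enum model_mem_type { MEM_SYSTEM, MEM_VRAM };
#define FLAG_CONTIGUOUS (1u << 0)

struct model_bo {
	const char *name;
	enum model_mem_type mem_type;
	unsigned int placement;	/* bit flags: FLAG_CONTIGUOUS or 0 */
};

/*
 * Models the check proposed in this patch: a BO that lives in VRAM
 * without the contiguous flag is reported as not worth evicting.
 */
static bool eviction_valuable(const struct model_bo *bo)
{
	if (bo->mem_type == MEM_VRAM && !(bo->placement & FLAG_CONTIGUOUS))
		return false;
	return true;
}

/*
 * Walks the list in order and returns the index of the first BO that
 * passes the check, or -1 if none does. Skipped entries stay in place.
 */
static int pick_eviction_candidate(const struct model_bo *lru, size_t n)
{
	assert(lru != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (eviction_valuable(&lru[i]))
			return (int)i;
	}
	return -1;
}
```

With a list holding a non-contiguous VRAM BO followed by a contiguous one, `pick_eviction_candidate()` returns the index of the second entry; with only non-contiguous VRAM BOs it returns -1.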
v2:
  - Added issue link and fixes tag.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2213
Fixes: c9cad937c0c5 ("drm/amdgpu: add drm buddy support to amdgpu")
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Cc: stable@vger.kernel.org #6.0
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 ++++
 1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index aea8d26b1724..1964de6ac997 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -1369,6 +1369,10 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
 	    amdgpu_bo_encrypted(ttm_to_amdgpu_bo(bo)))
 		return false;
 
+	if (bo->resource->mem_type == TTM_PL_VRAM &&
+	    !(bo->resource->placement & TTM_PL_FLAG_CONTIGUOUS))
+		return false;
+
 	return ttm_bo_eviction_valuable(bo, place);
 }
On 16.11.22 at 06:47, Arunpravin Paneer Selvam wrote:
> This patch fixes a VRAM BO eviction failure during resume, seen when playing the Steam game Cuphead.
> During PSP resume, the driver requests a 10240 KiB VRAM buffer for the trusted memory region (TMR). As part of this allocation we try to evict a few user buffers from the VRAM domain to the SYSTEM domain. The eviction fails because the selected resource does not have contiguous blocks, so the TMR request fails and the system gets stuck in the resume process.
> This change skips any resource that has non-contiguous blocks and moves on to the next available resource until it finds one with contiguous blocks. That resource is moved from VRAM to SYSTEM, the TMR allocation in VRAM then succeeds, and the system completes the resume.
Well quite a big NAK to this.
Evicting non-contiguous allocations is perfectly possible; it is just not supposed to happen during resume, when the DMA that is supposed to do it is not available.
The fundamental problem is that the PSP code frees and re-allocates the TMR during suspend/resume. This is absolutely not supposed to happen.
I'm going to propose a WARN_ON() to prevent subsystems from doing that. And I strongly suggest fixing the PSP code instead.
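As a rough illustration of what such a guard could look like, here is a userspace model. It is not the actual kernel change: `struct model_dev`, `model_warn_on()` and `model_vram_alloc()` are hypothetical names, and the sketch assumes the guard only reports the misuse and still lets the allocation proceed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical device state; not the real amdgpu device structure. */
struct model_dev {
	bool in_suspend;		/* set while suspend/resume is in progress */
	unsigned int warn_count;	/* number of warnings raised so far */
};

/*
 * Userspace stand-in for the kernel's WARN_ON(): reports the condition
 * when it is true and returns it so callers can branch on the result.
 */
static bool model_warn_on(struct model_dev *dev, bool cond, const char *what)
{
	if (cond) {
		dev->warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/*
 * Hypothetical allocation entry point. Allocating while the device is
 * suspending or resuming triggers the warning; the request itself is
 * still served, which is an assumption of this sketch.
 */
static void *model_vram_alloc(struct model_dev *dev, size_t size)
{
	model_warn_on(dev, dev->in_suspend,
		      "VRAM allocation during suspend/resume");
	return malloc(size);
}
```

A caller that allocates during normal operation sees no warning, while the same call made with `in_suspend` set increments `warn_count` and prints the message, which is how a subsystem that re-allocates its buffers on resume would be flagged.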
Regards,
Christian.
> v2:
>   - Added issue link and fixes tag.
>
> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2213
> Fixes: c9cad937c0c5 ("drm/amdgpu: add drm buddy support to amdgpu")
> Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
> Cc: stable@vger.kernel.org #6.0
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> index aea8d26b1724..1964de6ac997 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
> @@ -1369,6 +1369,10 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
>  	    amdgpu_bo_encrypted(ttm_to_amdgpu_bo(bo)))
>  		return false;
>
> +	if (bo->resource->mem_type == TTM_PL_VRAM &&
> +	    !(bo->resource->placement & TTM_PL_FLAG_CONTIGUOUS))
> +		return false;
> +
>  	return ttm_bo_eviction_valuable(bo, place);
>  }