If an object is backed up to shmem it is incorrectly identified as not having valid data by the move code. This means moving to VRAM skips the -EMULTIHOP step and the bo is cleared. This causes all sorts of weird behaviour on DGFX if an already evicted object is targeted by the shrinker.
Fix this by using ttm_tt_is_swapped() to identify backed-up objects.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5996 Fixes: 00c8efc3180f ("drm/xe: Add a shrinker for xe bos") Cc: Matthew Brost matthew.brost@intel.com Cc: Matthew Auld matthew.auld@intel.com Cc: stable@vger.kernel.org # v6.15+ Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com --- drivers/gpu/drm/xe/xe_bo.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c index 7d1ff642b02a..4faf15d5fa6d 100644 --- a/drivers/gpu/drm/xe/xe_bo.c +++ b/drivers/gpu/drm/xe/xe_bo.c @@ -823,8 +823,7 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict, return ret; }
- tt_has_data = ttm && (ttm_tt_is_populated(ttm) || - (ttm->page_flags & TTM_TT_FLAG_SWAPPED)); + tt_has_data = ttm && (ttm_tt_is_populated(ttm) || ttm_tt_is_swapped(ttm));
move_lacks_source = !old_mem || (handle_system_ccs ? (!bo->ccs_cleared) : (!mem_type_is_vram(old_mem_type) && !tt_has_data));
On 28/08/2025 14:48, Thomas Hellström wrote:
If an object is backed up to shmem it is incorrectly identified as not having valid data by the move code. This means moving to VRAM skips the -EMULTIHOP step and the bo is cleared. This causes all sorts of weird behaviour on DGFX if an already evicted object is targeted by the shrinker.
Fix this by using ttm_tt_is_swapped() to identify backed-up objects.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5996 Fixes: 00c8efc3180f ("drm/xe: Add a shrinker for xe bos") Cc: Matthew Brost matthew.brost@intel.com Cc: Matthew Auld matthew.auld@intel.com Cc: stable@vger.kernel.org # v6.15+ Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com
I guess we are missing some test coverage here?
Reviewed-by: Matthew Auld matthew.auld@intel.com
drivers/gpu/drm/xe/xe_bo.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c index 7d1ff642b02a..4faf15d5fa6d 100644 --- a/drivers/gpu/drm/xe/xe_bo.c +++ b/drivers/gpu/drm/xe/xe_bo.c @@ -823,8 +823,7 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict, return ret; }
- tt_has_data = ttm && (ttm_tt_is_populated(ttm) ||
(ttm->page_flags & TTM_TT_FLAG_SWAPPED));
- tt_has_data = ttm && (ttm_tt_is_populated(ttm) || ttm_tt_is_swapped(ttm));
move_lacks_source = !old_mem || (handle_system_ccs ? (!bo->ccs_cleared) : (!mem_type_is_vram(old_mem_type) && !tt_has_data));
Hi, Matthew
On Thu, 2025-08-28 at 15:00 +0100, Matthew Auld wrote:
On 28/08/2025 14:48, Thomas Hellström wrote:
If an object is backed up to shmem it is incorrectly identified as not having valid data by the move code. This means moving to VRAM skips the -EMULTIHOP step and the bo is cleared. This causes all sorts of weird behaviour on DGFX if an already evicted object is targeted by the shrinker.
Fix this by using ttm_tt_is_swapped() to identify backed-up objects.
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5996 Fixes: 00c8efc3180f ("drm/xe: Add a shrinker for xe bos") Cc: Matthew Brost matthew.brost@intel.com Cc: Matthew Auld matthew.auld@intel.com Cc: stable@vger.kernel.org # v6.15+ Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com
I guess we are missing some test coverage here?
Indeed. A bit embarrassing this has gone unnoticed for so long.
Reviewed-by: Matthew Auld matthew.auld@intel.com
Thanks for reviewing! /Thomas
drivers/gpu/drm/xe/xe_bo.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c index 7d1ff642b02a..4faf15d5fa6d 100644 --- a/drivers/gpu/drm/xe/xe_bo.c +++ b/drivers/gpu/drm/xe/xe_bo.c @@ -823,8 +823,7 @@ static int xe_bo_move(struct ttm_buffer_object *ttm_bo, bool evict, return ret; }
- tt_has_data = ttm && (ttm_tt_is_populated(ttm) ||
(ttm->page_flags &
TTM_TT_FLAG_SWAPPED));
- tt_has_data = ttm && (ttm_tt_is_populated(ttm) ||
ttm_tt_is_swapped(ttm)); move_lacks_source = !old_mem || (handle_system_ccs ? (!bo-
ccs_cleared) :
(!mem_type_is_vram(old_mem_type) && !tt_has_data));
linux-stable-mirror@lists.linaro.org