On Thu, 2025-08-28 at 17:27 +0100, Matthew Auld wrote:
On 28/08/2025 17:06, Thomas Hellström wrote:
Hi,
On Thu, 2025-08-28 at 16:59 +0100, Matthew Auld wrote:
On 28/08/2025 16:42, Thomas Hellström wrote:
VRAM+TT bos that are evicted from VRAM to TT may remain in TT also after a revalidation following eviction or suspend.
This manifests itself as applications becoming sluggish after buffer objects get evicted or after a resume from suspend or hibernation.
If the bo supports placement in both VRAM and TT, and we are on DGFX, mark the TT placement as fallback. This means that it is tried only after VRAM + eviction.
This flaw has probably been present since the xe module was upstreamed but use a Fixes: commit below where backporting is likely to be simple. For earlier versions we need to open-code the fallback algorithm in the driver.
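The effect of marking TT as fallback can be sketched with a small userspace model. This is only an illustration under stated assumptions, not the TTM implementation: `struct place`, `PL_FLAG_FALLBACK`, `try_alloc()` and `pick_placement()` are hypothetical stand-ins, and the toy memory model assumes VRAM is full but evictable while TT always has room.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real TTM definitions. */
enum mem_type { PL_TT, PL_VRAM };
#define PL_FLAG_FALLBACK 0x1u

struct place {
	enum mem_type mem_type;
	unsigned int flags;
};

/* Toy memory model (assumption): VRAM is full, so it only succeeds
 * when eviction is allowed; TT always has room. */
static bool try_alloc(enum mem_type type, bool may_evict)
{
	if (type == PL_VRAM)
		return may_evict;
	return true;
}

/*
 * Two-pass selection:
 *   pass 0: no eviction, placements marked as fallback are skipped;
 *   pass 1: eviction allowed, every placement is considered in order.
 * Returns the chosen memory type, or -1 if nothing fits.
 */
static int pick_placement(const struct place *p, size_t n)
{
	for (int pass = 0; pass < 2; pass++) {
		bool may_evict = (pass == 1);

		for (size_t i = 0; i < n; i++) {
			if (!may_evict && (p[i].flags & PL_FLAG_FALLBACK))
				continue;
			if (try_alloc(p[i].mem_type, may_evict))
				return (int)p[i].mem_type;
		}
	}
	return -1;
}
```

In this model, a list of { VRAM, TT } without the flag settles on TT as soon as VRAM is full, which matches the sluggishness described above. With the flag set on TT, the first pass finds nothing and the second pass picks VRAM after eviction.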
Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5995
Fixes: a78a8da51b36 ("drm/ttm: replace busy placement with flags v6")
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: stable@vger.kernel.org # v6.9+
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
 drivers/gpu/drm/xe/xe_bo.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_bo.c b/drivers/gpu/drm/xe/xe_bo.c
index 4faf15d5fa6d..64dea4e478bd 100644
--- a/drivers/gpu/drm/xe/xe_bo.c
+++ b/drivers/gpu/drm/xe/xe_bo.c
@@ -188,6 +188,8 @@ static void try_add_system(struct xe_device *xe, struct xe_bo *bo,
 		bo->placements[*c] = (struct ttm_place) {
 			.mem_type = XE_PL_TT,
.flags = (IS_DGFX(xe) && (bo_flags &
XE_BO_FLAG_VRAM_MASK)) ?
I suppose we could drop the dgfx check here?
Thanks for reviewing. From a quick look it looks like the VRAM_MASK bits can be set also on IGFX? And if so, then it's not ideal to mark the primary placement as FALLBACK. But I might have missed a rejection somewhere.
I was sweating bullets for a second there, but it looks like it gets rejected in the ioctl with:
if (XE_IOCTL_DBG(xe, (args->placement & ~xe->info.mem_region_mask)))
	return -EINVAL;
The flags get converted from the args->placement, and VRAM should never appear in the mem_region_mask on igpu. If we allowed it I think it would crash in add_vram() since the vram manager does not exist so ttm_manager_type() would be NULL, AFAICT.
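The rejection described above can be illustrated with a minimal userspace sketch. It assumes a hypothetical bit layout (`REGION_SYSTEM`, `REGION_VRAM0`) and a hypothetical helper name, `placement_allowed()`. Only the mask test itself is taken from the snippet quoted in the thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical region bits for illustration; the driver's layout may differ. */
#define REGION_SYSTEM (1u << 0)
#define REGION_VRAM0  (1u << 1)

/*
 * A request is acceptable only if it names no region outside the
 * device's mem_region_mask. This mirrors the quoted test
 * (placement & ~mem_region_mask), which must be zero.
 */
static bool placement_allowed(uint32_t placement, uint32_t mem_region_mask)
{
	return (placement & ~mem_region_mask) == 0;
}
```

With an integrated-GPU mask of `REGION_SYSTEM` only, a request that includes `REGION_VRAM0` is refused, so the VRAM flag bits never reach the placement code. A discrete-GPU mask of `REGION_SYSTEM | REGION_VRAM0` accepts both.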
Thanks. Right, I'll spin a v2 and drop the IS_DGFX() test.
/Thomas
/Thomas
Either way,
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
TTM_PL_FLAG_FALLBACK : 0,
 		};
 		*c += 1;
 	}