From: Simon Richter <Simon.Richter@hogyros.de>
[ Upstream commit b85bb2d677153d990924d31be9416166d22382eb ]
If PAGE_SIZE != XE_PAGE_SIZE (which is currently locked behind CONFIG_BROKEN), this would generate the wrong number of PDEs.
Since these PDEs are consumed by the GPU, the GPU page size needs to be used.
Signed-off-by: Simon Richter <Simon.Richter@hogyros.de>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://lore.kernel.org/r/20250818064806.2835-1-Simon.Richter@hogyros.de
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
LLM Generated explanations, may be completely bogus:
YES
- What it fixes
  - The loop that writes PDEs uses the host `PAGE_SIZE` instead of the GPU page size `XE_PAGE_SIZE`, causing an incorrect PDE count when the two differ. In 6.17.1, this is at drivers/gpu/drm/xe/xe_migrate.c:292:
    - Current:  for (i = 0; i < map_ofs / PAGE_SIZE; i++) {
    - Intended: for (i = 0; i < map_ofs / XE_PAGE_SIZE; i++) {
  - The PDEs are consumed by the GPU, and the offsets encoded for each entry already use `XE_PAGE_SIZE` (drivers/gpu/drm/xe/xe_migrate.c:293–297), so the loop bound must match that unit (see the annotated sketch after the summary below).
- Why it matters
  - When `PAGE_SIZE != XE_PAGE_SIZE` (e.g., 64K host pages vs 4K GPU pages), the loop iterates too few times, by a factor of `PAGE_SIZE / XE_PAGE_SIZE`, leaving a large portion of the PDEs unwritten (see the worked example after this list). That results in incomplete page table coverage and GPU faults/hangs when the unmapped regions are accessed. The fix enforces GPU page granularity for the loop count, which is the only correct interpretation since the GPU page tables and the encoded offsets (i * XE_PAGE_SIZE) are in GPU page units.
  - The rest of the function already treats `map_ofs` in GPU page units:
    - PDE setup for the upper levels uses `XE_PAGE_SIZE` (drivers/gpu/drm/xe/xe_migrate.c:285–288).
    - The VM suballocator capacity is computed as `map_ofs / XE_PAGE_SIZE` (drivers/gpu/drm/xe/xe_migrate.c:356–357).
  - This change removes an inconsistency within the same function and aligns the loop with how `map_ofs` is used elsewhere.
- Scope and risk
  - One-line change, confined to xe migrate VM setup (`xe_migrate_prepare_vm()`); no API or architectural changes.
  - On the common 4K-host-page configurations, `PAGE_SIZE == XE_PAGE_SIZE`, so behavior is identical. The risk of regression on mainstream builds is effectively zero.
  - On kernels where `PAGE_SIZE != XE_PAGE_SIZE`, it fixes real misprogramming of PDEs that can manifest as GPU page faults or hangs.
- Current gating and impact
  - `DRM_XE` currently depends on `PAGE_SIZE_4KB || COMPILE_TEST || BROKEN` (drivers/gpu/drm/xe/Kconfig), which is why the commit message notes this path is presently behind `CONFIG_BROKEN`. Even so, this is a correctness bug that becomes user-visible as soon as non-4K page sizes are enabled, and it is harmless on 4K systems.
- Stable criteria
  - Fixes a clear bug in page table programming that can affect users when the Kconfig constraint is relaxed or under non-4K configurations.
  - Minimal, well-contained change with no feature additions and no architectural rewrites.
  - No adverse side effects; it only enforces correct unit semantics.
  - Reviewed by xe maintainers, according to the commit tags.
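As a concrete illustration of the undercount, the standalone sketch below uses hypothetical values (a 64K `PAGE_SIZE` kernel and an arbitrary example `map_ofs`; the real value depends on the BO layout) to show that the old bound writes only 1/16th of the required PDEs:

    /*
     * Standalone illustration only, not kernel code: HOST_PAGE_SIZE stands
     * in for PAGE_SIZE on a hypothetical 64K-page kernel, and map_ofs is an
     * arbitrary example value.
     */
    #include <stdio.h>

    #define HOST_PAGE_SIZE (64 * 1024UL) /* hypothetical 64K host pages */
    #define XE_PAGE_SIZE   (4 * 1024UL)  /* GPU page size used by xe */

    int main(void)
    {
            unsigned long map_ofs = 48 * XE_PAGE_SIZE; /* example: 192K */

            /* Old bound: only 3 PDEs written; 45 of 48 are left unwritten. */
            printf("PAGE_SIZE bound:    %lu PDEs\n", map_ofs / HOST_PAGE_SIZE);

            /* Fixed bound: all 48 PDEs written, one per GPU page of the BO. */
            printf("XE_PAGE_SIZE bound: %lu PDEs\n", map_ofs / XE_PAGE_SIZE);
            return 0;
    }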
Summary: Replace `map_ofs / PAGE_SIZE` with `map_ofs / XE_PAGE_SIZE` in the PDE emission loop (drivers/gpu/drm/xe/xe_migrate.c:292) to make the loop’s unit consistent with GPU page size and the rest of the function’s logic. This is an obvious, low-risk bugfix suitable for stable backport.
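For context, a simplified, annotated sketch of the fixed loop (paraphrased from the hunk below, not the verbatim driver code; the destination-offset arithmetic of the xe_map_wr() call is elided):

    /*
     * Simplified sketch based on the hunk below. Both the iteration count
     * and the BO offset passed to pde_encode_bo() are expressed in
     * XE_PAGE_SIZE (GPU page) units, so every GPU page of the page-table
     * BO gets a PDE.
     */
    for (i = 0; i < map_ofs / XE_PAGE_SIZE; i++) {
            entry = vm->pt_ops->pde_encode_bo(bo, (u64)i * XE_PAGE_SIZE);

            /* ... write 'entry' into the PDE slot for index i via xe_map_wr() ... */
    }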
 drivers/gpu/drm/xe/xe_migrate.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index 9b1e3dce1aea3..2a627ed64b8f8 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -291,7 +291,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
 	}
 
 	/* Write PDE's that point to our BO. */
-	for (i = 0; i < map_ofs / PAGE_SIZE; i++) {
+	for (i = 0; i < map_ofs / XE_PAGE_SIZE; i++) {
 		entry = vm->pt_ops->pde_encode_bo(bo, (u64)i * XE_PAGE_SIZE);
 
 		xe_map_wr(xe, &bo->vmap, map_ofs + XE_PAGE_SIZE +