Due to some issues with hibernation on Lunar Lake (integrated), it was decided to re-use the migration logic from Battle Mage (discrete). However, in 6.11 several patches needed for that to work were missing. A few patches were picked automatically for 6.11.10, but they are not sufficient. Bring in the additional patches, plus some tests to make sure the backports work: these correspond to 20 of the patches here. The others are additional fixes or dependencies.
This was tested on top of 6.11.10.
Akshata Jahagirdar (5):
  drm/xe/migrate: Handle clear ccs logic for xe2 dgfx
  drm/xe/migrate: Add helper function to program identity map
  drm/xe/migrate: Add kunit to test clear functionality
  drm/xe/xe2: Introduce identity map for compressed pat for vram
  drm/xe/xe_migrate: Handle migration logic for xe2+ dgfx

Aradhya Bhatia (1):
  drm/xe/xe2lpg: Extend Wa_15016589081 for xe2lpg

Chaitanya Kumar Borah (1):
  drm/i915: Do not explicilty enable FEC in DP_TP_CTL for UHBR rates

Daniele Ceraolo Spurio (1):
  drm/xe/uc: Use managed bo for HuC and GSC objects

Gustavo Sousa (2):
  drm/xe/xe2: Extend performance tuning to media GT
  drm/xe/xe2: Add performance tuning for L3 cache flushing

He Lugang (1):
  drm/xe: use devm_add_action_or_reset() helper

Imre Deak (5):
  drm/xe: Handle polling only for system s/r in xe_display_pm_suspend/resume()
  drm/i915/dp: Assume panel power is off if runtime suspended
  drm/i915/dp: Disable unnecessary HPD polling for eDP
  drm/xe/display: Separate the d3cold and non-d3cold runtime PM handling
  drm/xe/display: Add missing HPD interrupt enabling during non-d3cold RPM resume

Maarten Lankhorst (2):
  drm/xe: Remove runtime argument from display s/r functions
  drm/xe: Fix missing conversion to xe_display_pm_runtime_resume

Matthew Auld (3):
  drm/xe/client: use mem_type from the current resource
  drm/xe/queue: move xa_alloc to prevent UAF
  drm/xe/bmg: improve cache flushing behaviour

Matthew Brost (1):
  drm/xe: Do not run GPU page fault handler on a closed VM

Michal Wajdeczko (4):
  drm/xe/kunit: Kill xe_cur_kunit()
  drm/xe/kunit: Simplify xe_bo live tests code layout
  drm/xe/kunit: Simplify xe_dma_buf live tests code layout
  drm/xe/kunit: Simplify xe_migrate live tests code layout

Rodrigo Vivi (1):
  drm/{i915, xe}: Avoid direct inspection of dpt_vma from outside dpt

Suraj Kandpal (2):
  drm/xe/display: Do not suspend resume dp mst during runtime
  drm/xe/display: Do not do intel_fbdev_set_suspend during runtime

Thomas Hellström (1):
  drm/xe: Use separate rpm lockdep map for non-d3cold-capable devices

Vinod Govindapillai (1):
  drm/xe/display: handle HPD polling in display runtime suspend/resume
 drivers/gpu/drm/i915/display/intel_dp.c | 16 +-
 drivers/gpu/drm/i915/display/intel_dpt.c | 4 +
 drivers/gpu/drm/i915/display/intel_dpt.h | 3 +
 .../drm/i915/display/skl_universal_plane.c | 3 +-
 drivers/gpu/drm/i915/intel_runtime_pm.h | 8 +-
 .../xe/compat-i915-headers/intel_runtime_pm.h | 8 +
 drivers/gpu/drm/xe/display/xe_display.c | 78 ++++-
 drivers/gpu/drm/xe/display/xe_display.h | 12 +-
 drivers/gpu/drm/xe/display/xe_fb_pin.c | 9 +-
 drivers/gpu/drm/xe/regs/xe_gt_regs.h | 12 +-
 drivers/gpu/drm/xe/tests/Makefile | 3 -
 drivers/gpu/drm/xe/tests/xe_bo.c | 24 +-
 drivers/gpu/drm/xe/tests/xe_bo_test.c | 21 --
 drivers/gpu/drm/xe/tests/xe_bo_test.h | 14 -
 drivers/gpu/drm/xe/tests/xe_dma_buf.c | 20 +-
 drivers/gpu/drm/xe/tests/xe_dma_buf_test.c | 20 --
 drivers/gpu/drm/xe/tests/xe_dma_buf_test.h | 13 -
 drivers/gpu/drm/xe/tests/xe_live_test_mod.c | 9 +
 drivers/gpu/drm/xe/tests/xe_migrate.c | 299 +++++++++++++++++-
 drivers/gpu/drm/xe/tests/xe_migrate_test.c | 20 --
 drivers/gpu/drm/xe/tests/xe_migrate_test.h | 13 -
 drivers/gpu/drm/xe/tests/xe_mocs.c | 8 +-
 drivers/gpu/drm/xe/tests/xe_pci_test.c | 4 +-
 drivers/gpu/drm/xe/tests/xe_test.h | 8 +-
 drivers/gpu/drm/xe/xe_drm_client.c | 7 +-
 drivers/gpu/drm/xe/xe_exec_queue.c | 4 +-
 drivers/gpu/drm/xe/xe_gsc.c | 12 +-
 drivers/gpu/drm/xe/xe_gsc_proxy.c | 36 +--
 drivers/gpu/drm/xe/xe_gt.c | 1 -
 drivers/gpu/drm/xe/xe_gt_freq.c | 4 +-
 drivers/gpu/drm/xe/xe_gt_pagefault.c | 6 +
 drivers/gpu/drm/xe/xe_gt_sysfs.c | 2 +-
 drivers/gpu/drm/xe/xe_huc.c | 19 +-
 drivers/gpu/drm/xe/xe_migrate.c | 185 +++++++----
 drivers/gpu/drm/xe/xe_module.c | 9 +
 drivers/gpu/drm/xe/xe_pm.c | 100 ++++--
 drivers/gpu/drm/xe/xe_pm.h | 1 +
 drivers/gpu/drm/xe/xe_tuning.c | 28 ++
 drivers/gpu/drm/xe/xe_wa.c | 4 +
 39 files changed, 735 insertions(+), 312 deletions(-)
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_bo_test.c
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_bo_test.h
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_dma_buf_test.c
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_dma_buf_test.h
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_migrate_test.c
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_migrate_test.h
From: Akshata Jahagirdar akshata.jahagirdar@intel.com
commit 108c972a11c5f6e37be58207460d9bcac06698db upstream.
For Xe2 dGPU, we clear the bo by modifying the VRAM using an uncompressed pat index, which then indirectly updates the compression status to uncompressed, i.e. zeroed CCS. So xe_migrate_clear() should be updated for BMG to not emit CCS surf copy commands.
v2: Moved xe_device_needs_ccs_emit() to xe_migrate.c and changed name to xe_migrate_needs_ccs_emit() since it's very specific to migration. (Matt)
Signed-off-by: Akshata Jahagirdar akshata.jahagirdar@intel.com
Reviewed-by: Matthew Auld matthew.auld@intel.com
Reviewed-by: Himal Prasad Ghimiray himal.prasad.ghimiray@intel.com
Signed-off-by: Matt Roper matthew.d.roper@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/8dd869dd8dda5e17ace28c04f1a486...
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
---
 drivers/gpu/drm/xe/xe_migrate.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index a849c48d8ac90..8315cb02f370d 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -348,6 +348,11 @@ static u32 xe_migrate_usm_logical_mask(struct xe_gt *gt)
 	return logical_mask;
 }
 
+static bool xe_migrate_needs_ccs_emit(struct xe_device *xe)
+{
+	return xe_device_has_flat_ccs(xe) && !(GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe));
+}
+
 /**
  * xe_migrate_init() - Initialize a migrate context
  * @tile: Back-pointer to the tile we're initializing for.
@@ -421,7 +426,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
 		return ERR_PTR(err);
 
 	if (IS_DGFX(xe)) {
-		if (xe_device_has_flat_ccs(xe))
+		if (xe_migrate_needs_ccs_emit(xe))
 			/* min chunk size corresponds to 4K of CCS Metadata */
 			m->min_chunk_size = SZ_4K * SZ_64K /
 				xe_device_ccs_bytes(xe, SZ_64K);
@@ -1035,7 +1040,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
 			clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0,
 			avail_pts);
 
-		if (xe_device_has_flat_ccs(xe))
+		if (xe_migrate_needs_ccs_emit(xe))
 			batch_size += EMIT_COPY_CCS_DW;
 
 		/* Clear commands */
@@ -1063,7 +1068,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m,
 		if (!clear_system_ccs)
 			emit_clear(gt, bb, clear_L0_ofs, clear_L0, XE_PAGE_SIZE, clear_vram);
 
-		if (xe_device_has_flat_ccs(xe)) {
+		if (xe_migrate_needs_ccs_emit(xe)) {
 			emit_copy_ccs(gt, bb, clear_L0_ofs, true,
 				      m->cleared_mem_ofs, false, clear_L0);
 			flush_flags = MI_FLUSH_DW_CCS;
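As a reading aid for the patch above, the new predicate can be modelled in plain userspace C. This is only a sketch under stated assumptions: `struct mock_xe_device`, `mock_needs_ccs_emit()` and `mock_check_platform_classes()` are hypothetical stand-ins for `struct xe_device`, `GRAPHICS_VER()`, `IS_DGFX()` and `xe_device_has_flat_ccs()`, and are not driver code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct xe_device: only what the predicate reads. */
struct mock_xe_device {
	unsigned int graphics_ver;	/* models GRAPHICS_VER(xe) */
	bool is_dgfx;			/* models IS_DGFX(xe) */
	bool has_flat_ccs;		/* models xe_device_has_flat_ccs(xe) */
};

/* Same boolean expression as xe_migrate_needs_ccs_emit() in the patch. */
static bool mock_needs_ccs_emit(const struct mock_xe_device *xe)
{
	return xe->has_flat_ccs &&
	       !(xe->graphics_ver >= 20 && xe->is_dgfx);
}

/* Walk the interesting device classes and check the expected result. */
static void mock_check_platform_classes(void)
{
	struct mock_xe_device xe2_dgfx = { 20, true, true };
	struct mock_xe_device xe2_igfx = { 20, false, true };
	struct mock_xe_device old_dgfx = { 12, true, true };
	struct mock_xe_device no_ccs = { 12, false, false };

	assert(!mock_needs_ccs_emit(&xe2_dgfx));	/* clear path skips CCS copy */
	assert(mock_needs_ccs_emit(&xe2_igfx));		/* integrated Xe2 still emits */
	assert(mock_needs_ccs_emit(&old_dgfx));		/* pre-Xe2 discrete still emits */
	assert(!mock_needs_ccs_emit(&no_ccs));		/* nothing to emit without flat CCS */
}
```

Only the combination "graphics version 20 or newer and discrete" flips the result; every other device with flat CCS keeps the previous behaviour.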
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 108c972a11c5f6e37be58207460d9bcac06698db
WARNING: Author mismatch between patch and upstream commit:
Backport author: Lucas De Marchi lucas.demarchi@intel.com
Commit author: Akshata Jahagirdar akshata.jahagirdar@intel.com

Status in newer kernel trees:
6.12.y | Present (exact SHA1)
6.11.y | Not found
Note: The patch differs from the upstream commit: --- --- - 2024-11-23 08:22:04.287621219 -0500 +++ /tmp/tmp.piLgrFj0iY 2024-11-23 08:22:04.278840217 -0500 @@ -1,3 +1,5 @@ +commit 108c972a11c5f6e37be58207460d9bcac06698db upstream. + For Xe2 dGPU, we clear the bo by modifying the VRAM using an uncompressed pat index which then indirectly updates the compression status as uncompressed i.e zeroed CCS. @@ -13,15 +15,16 @@ Reviewed-by: Himal Prasad Ghimiray himal.prasad.ghimiray@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/8dd869dd8dda5e17ace28c04f1a486... +Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/xe/xe_migrate.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c -index fa23a7e7ec435..85eec95c9bc27 100644 +index a849c48d8ac90..8315cb02f370d 100644 --- a/drivers/gpu/drm/xe/xe_migrate.c +++ b/drivers/gpu/drm/xe/xe_migrate.c -@@ -347,6 +347,11 @@ static u32 xe_migrate_usm_logical_mask(struct xe_gt *gt) +@@ -348,6 +348,11 @@ static u32 xe_migrate_usm_logical_mask(struct xe_gt *gt) return logical_mask; }
@@ -33,7 +36,7 @@ /** * xe_migrate_init() - Initialize a migrate context * @tile: Back-pointer to the tile we're initializing for. -@@ -420,7 +425,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile) +@@ -421,7 +426,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile) return ERR_PTR(err);
if (IS_DGFX(xe)) { @@ -42,7 +45,7 @@ /* min chunk size corresponds to 4K of CCS Metadata */ m->min_chunk_size = SZ_4K * SZ_64K / xe_device_ccs_bytes(xe, SZ_64K); -@@ -1034,7 +1039,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m, +@@ -1035,7 +1040,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m, clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0, avail_pts);
@@ -51,7 +54,7 @@ batch_size += EMIT_COPY_CCS_DW;
/* Clear commands */ -@@ -1062,7 +1067,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m, +@@ -1063,7 +1068,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m, if (!clear_system_ccs) emit_clear(gt, bb, clear_L0_ofs, clear_L0, XE_PAGE_SIZE, clear_vram);
@@ -60,3 +63,6 @@ emit_copy_ccs(gt, bb, clear_L0_ofs, true, m->cleared_mem_ofs, false, clear_L0); flush_flags = MI_FLUSH_DW_CCS; +-- +2.47.0 + ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.11.y       | Success     | Success    |
From: Akshata Jahagirdar akshata.jahagirdar@intel.com
commit 8d79acd567db183e675cccc6cc737d2959e2a2d9 upstream.
Add a helper function to program the identity map.
v2: Formatting nits
Signed-off-by: Akshata Jahagirdar akshata.jahagirdar@intel.com
Reviewed-by: Matthew Brost matthew.brost@intel.com
Signed-off-by: Matt Roper matthew.d.roper@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/91dc05f05bd33076fb9a9f74f8495b...
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
---
 drivers/gpu/drm/xe/xe_migrate.c | 88 ++++++++++++++++++---------------
 1 file changed, 48 insertions(+), 40 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index 8315cb02f370d..f1cdb6f1fa176 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -131,6 +131,51 @@ static u64 xe_migrate_vram_ofs(struct xe_device *xe, u64 addr)
 	return addr + (256ULL << xe_pt_shift(2));
 }
 
+static void xe_migrate_program_identity(struct xe_device *xe, struct xe_vm *vm, struct xe_bo *bo,
+					u64 map_ofs, u64 vram_offset, u16 pat_index, u64 pt_2m_ofs)
+{
+	u64 pos, ofs, flags;
+	u64 entry;
+	/* XXX: Unclear if this should be usable_size? */
+	u64 vram_limit = xe->mem.vram.actual_physical_size +
+		xe->mem.vram.dpa_base;
+	u32 level = 2;
+
+	ofs = map_ofs + XE_PAGE_SIZE * level + vram_offset * 8;
+	flags = vm->pt_ops->pte_encode_addr(xe, 0, pat_index, level,
+					    true, 0);
+
+	xe_assert(xe, IS_ALIGNED(xe->mem.vram.usable_size, SZ_2M));
+
+	/*
+	 * Use 1GB pages when possible, last chunk always use 2M
+	 * pages as mixing reserved memory (stolen, WOCPM) with a single
+	 * mapping is not allowed on certain platforms.
+	 */
+	for (pos = xe->mem.vram.dpa_base; pos < vram_limit;
+	     pos += SZ_1G, ofs += 8) {
+		if (pos + SZ_1G >= vram_limit) {
+			entry = vm->pt_ops->pde_encode_bo(bo, pt_2m_ofs,
+							  pat_index);
+			xe_map_wr(xe, &bo->vmap, ofs, u64, entry);
+
+			flags = vm->pt_ops->pte_encode_addr(xe, 0,
+							    pat_index,
+							    level - 1,
+							    true, 0);
+
+			for (ofs = pt_2m_ofs; pos < vram_limit;
+			     pos += SZ_2M, ofs += 8)
+				xe_map_wr(xe, &bo->vmap, ofs, u64, pos | flags);
+			break;	/* Ensure pos == vram_limit assert correct */
+		}
+
+		xe_map_wr(xe, &bo->vmap, ofs, u64, pos | flags);
+	}
+
+	xe_assert(xe, pos == vram_limit);
+}
+
 static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
 				 struct xe_vm *vm)
 {
@@ -254,47 +299,10 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
 
 	/* Identity map the entire vram at 256GiB offset */
 	if (IS_DGFX(xe)) {
-		u64 pos, ofs, flags;
-		/* XXX: Unclear if this should be usable_size? */
-		u64 vram_limit = xe->mem.vram.actual_physical_size +
-			xe->mem.vram.dpa_base;
-
-		level = 2;
-		ofs = map_ofs + XE_PAGE_SIZE * level + 256 * 8;
-		flags = vm->pt_ops->pte_encode_addr(xe, 0, pat_index, level,
-						    true, 0);
-
-		xe_assert(xe, IS_ALIGNED(xe->mem.vram.usable_size, SZ_2M));
-
-		/*
-		 * Use 1GB pages when possible, last chunk always use 2M
-		 * pages as mixing reserved memory (stolen, WOCPM) with a single
-		 * mapping is not allowed on certain platforms.
-		 */
-		for (pos = xe->mem.vram.dpa_base; pos < vram_limit;
-		     pos += SZ_1G, ofs += 8) {
-			if (pos + SZ_1G >= vram_limit) {
-				u64 pt31_ofs = bo->size - XE_PAGE_SIZE;
-
-				entry = vm->pt_ops->pde_encode_bo(bo, pt31_ofs,
-								  pat_index);
-				xe_map_wr(xe, &bo->vmap, ofs, u64, entry);
-
-				flags = vm->pt_ops->pte_encode_addr(xe, 0,
-								    pat_index,
-								    level - 1,
-								    true, 0);
-
-				for (ofs = pt31_ofs; pos < vram_limit;
-				     pos += SZ_2M, ofs += 8)
-					xe_map_wr(xe, &bo->vmap, ofs, u64, pos | flags);
-				break;	/* Ensure pos == vram_limit assert correct */
-			}
-
-			xe_map_wr(xe, &bo->vmap, ofs, u64, pos | flags);
-		}
+		u64 pt31_ofs = bo->size - XE_PAGE_SIZE;
 
-		xe_assert(xe, pos == vram_limit);
+		xe_migrate_program_identity(xe, vm, bo, map_ofs, 256, pat_index, pt31_ofs);
+		xe_assert(xe, (xe->mem.vram.actual_physical_size <= SZ_256G));
 	}
 
 	/*
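To make the 1G/2M split in xe_migrate_program_identity() easier to follow, here is a userspace sketch that only counts how many entries of each size the loop would write. It is an illustration, not driver code: `struct identity_map_counts` and `count_identity_entries()` are hypothetical names, the page-table writes are replaced by counters, and the alignment check is simplified to the mapped range rather than usable_size.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M (2ULL << 20)
#define SZ_1G (1ULL << 30)

/* Hypothetical result type: how many entries of each size the loop writes. */
struct identity_map_counts {
	uint64_t n_1g;	/* 1G entries written at the upper level */
	uint64_t n_2m;	/* 2M entries written into the last-chunk table */
	uint64_t end;	/* final value of pos; expected to equal vram_limit */
};

/*
 * Mirrors the control flow of the helper in the patch: 1G steps while a
 * whole 1G page fits strictly below vram_limit, then 2M steps for the
 * final chunk (which may itself be a full 1G).
 */
static struct identity_map_counts
count_identity_entries(uint64_t dpa_base, uint64_t vram_limit)
{
	struct identity_map_counts c = { 0, 0, 0 };
	uint64_t pos;

	/* Simplifying assumption: the mapped range is 2M aligned. */
	assert((vram_limit - dpa_base) % SZ_2M == 0);

	for (pos = dpa_base; pos < vram_limit; pos += SZ_1G) {
		if (pos + SZ_1G >= vram_limit) {
			for (; pos < vram_limit; pos += SZ_2M)
				c.n_2m++;
			break;	/* keep pos == vram_limit, as in the patch */
		}
		c.n_1g++;
	}

	c.end = pos;
	return c;
}
```

With 3.5 GiB starting at address 0, this counts three 1G entries followed by 256 2M entries. With exactly 2 GiB, it counts one 1G entry and 512 2M entries, because the last chunk always goes through the 2M path.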
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 8d79acd567db183e675cccc6cc737d2959e2a2d9
WARNING: Author mismatch between patch and upstream commit:
Backport author: Lucas De Marchi lucas.demarchi@intel.com
Commit author: Akshata Jahagirdar akshata.jahagirdar@intel.com

Status in newer kernel trees:
6.12.y | Present (exact SHA1)
6.11.y | Not found
Note: The patch differs from the upstream commit: --- --- - 2024-11-23 08:34:58.863867900 -0500 +++ /tmp/tmp.d91DYLc9D7 2024-11-23 08:34:58.856237487 -0500 @@ -1,3 +1,5 @@ +commit 8d79acd567db183e675cccc6cc737d2959e2a2d9 upstream. + Add an helper function to program identity map.
v2: Formatting nits @@ -6,15 +8,16 @@ Reviewed-by: Matthew Brost matthew.brost@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/91dc05f05bd33076fb9a9f74f8495b... +Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/xe/xe_migrate.c | 88 ++++++++++++++++++--------------- 1 file changed, 48 insertions(+), 40 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c -index 85eec95c9bc27..49ad5d8443cf2 100644 +index 8315cb02f370d..f1cdb6f1fa176 100644 --- a/drivers/gpu/drm/xe/xe_migrate.c +++ b/drivers/gpu/drm/xe/xe_migrate.c -@@ -130,6 +130,51 @@ static u64 xe_migrate_vram_ofs(struct xe_device *xe, u64 addr) +@@ -131,6 +131,51 @@ static u64 xe_migrate_vram_ofs(struct xe_device *xe, u64 addr) return addr + (256ULL << xe_pt_shift(2)); }
@@ -66,7 +69,7 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m, struct xe_vm *vm) { -@@ -253,47 +298,10 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m, +@@ -254,47 +299,10 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
/* Identity map the entire vram at 256GiB offset */ if (IS_DGFX(xe)) { @@ -117,3 +120,6 @@ }
/* +-- +2.47.0 + ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.11.y       | Success     | Success    |
From: Michal Wajdeczko michal.wajdeczko@intel.com
commit bd85e00fa489f5374c2bad0eac15842d2ec68045 upstream.
We shouldn't use a custom helper if there is an official one.
Signed-off-by: Michal Wajdeczko michal.wajdeczko@intel.com
Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20240705191057.1110-2-michal.w...
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
---
 drivers/gpu/drm/xe/tests/xe_bo.c | 4 ++--
 drivers/gpu/drm/xe/tests/xe_dma_buf.c | 4 ++--
 drivers/gpu/drm/xe/tests/xe_migrate.c | 2 +-
 drivers/gpu/drm/xe/tests/xe_mocs.c | 8 ++++----
 drivers/gpu/drm/xe/tests/xe_pci_test.c | 4 ++--
 drivers/gpu/drm/xe/tests/xe_test.h | 8 +++-----
 6 files changed, 14 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/xe/tests/xe_bo.c b/drivers/gpu/drm/xe/tests/xe_bo.c index 9f3c028264649..263e0afa8de0c 100644 --- a/drivers/gpu/drm/xe/tests/xe_bo.c +++ b/drivers/gpu/drm/xe/tests/xe_bo.c @@ -154,7 +154,7 @@ static void ccs_test_run_tile(struct xe_device *xe, struct xe_tile *tile,
static int ccs_test_run_device(struct xe_device *xe) { - struct kunit *test = xe_cur_kunit(); + struct kunit *test = kunit_get_current_test(); struct xe_tile *tile; int id;
@@ -325,7 +325,7 @@ static int evict_test_run_tile(struct xe_device *xe, struct xe_tile *tile, struc
static int evict_test_run_device(struct xe_device *xe) { - struct kunit *test = xe_cur_kunit(); + struct kunit *test = kunit_get_current_test(); struct xe_tile *tile; int id;
diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf.c b/drivers/gpu/drm/xe/tests/xe_dma_buf.c index e7f9b531c4654..b56013963911e 100644 --- a/drivers/gpu/drm/xe/tests/xe_dma_buf.c +++ b/drivers/gpu/drm/xe/tests/xe_dma_buf.c @@ -107,7 +107,7 @@ static void check_residency(struct kunit *test, struct xe_bo *exported,
static void xe_test_dmabuf_import_same_driver(struct xe_device *xe) { - struct kunit *test = xe_cur_kunit(); + struct kunit *test = kunit_get_current_test(); struct dma_buf_test_params *params = to_dma_buf_test_params(test->priv); struct drm_gem_object *import; struct dma_buf *dmabuf; @@ -258,7 +258,7 @@ static const struct dma_buf_test_params test_params[] = { static int dma_buf_run_device(struct xe_device *xe) { const struct dma_buf_test_params *params; - struct kunit *test = xe_cur_kunit(); + struct kunit *test = kunit_get_current_test();
xe_pm_runtime_get(xe); for (params = test_params; params->mem_mask; ++params) { diff --git a/drivers/gpu/drm/xe/tests/xe_migrate.c b/drivers/gpu/drm/xe/tests/xe_migrate.c index 962f6438e2192..d277a21ccf910 100644 --- a/drivers/gpu/drm/xe/tests/xe_migrate.c +++ b/drivers/gpu/drm/xe/tests/xe_migrate.c @@ -334,7 +334,7 @@ static void xe_migrate_sanity_test(struct xe_migrate *m, struct kunit *test)
static int migrate_test_run_device(struct xe_device *xe) { - struct kunit *test = xe_cur_kunit(); + struct kunit *test = kunit_get_current_test(); struct xe_tile *tile; int id;
diff --git a/drivers/gpu/drm/xe/tests/xe_mocs.c b/drivers/gpu/drm/xe/tests/xe_mocs.c index 67c65e88c3845..4fff5de92dea1 100644 --- a/drivers/gpu/drm/xe/tests/xe_mocs.c +++ b/drivers/gpu/drm/xe/tests/xe_mocs.c @@ -23,7 +23,7 @@ struct live_mocs { static int live_mocs_init(struct live_mocs *arg, struct xe_gt *gt) { unsigned int flags; - struct kunit *test = xe_cur_kunit(); + struct kunit *test = kunit_get_current_test();
memset(arg, 0, sizeof(*arg));
@@ -41,7 +41,7 @@ static int live_mocs_init(struct live_mocs *arg, struct xe_gt *gt) static void read_l3cc_table(struct xe_gt *gt, const struct xe_mocs_info *info) { - struct kunit *test = xe_cur_kunit(); + struct kunit *test = kunit_get_current_test(); u32 l3cc, l3cc_expected; unsigned int i; u32 reg_val; @@ -78,7 +78,7 @@ static void read_l3cc_table(struct xe_gt *gt, static void read_mocs_table(struct xe_gt *gt, const struct xe_mocs_info *info) { - struct kunit *test = xe_cur_kunit(); + struct kunit *test = kunit_get_current_test(); u32 mocs, mocs_expected; unsigned int i; u32 reg_val; @@ -148,7 +148,7 @@ static int mocs_reset_test_run_device(struct xe_device *xe) struct xe_gt *gt; unsigned int flags; int id; - struct kunit *test = xe_cur_kunit(); + struct kunit *test = kunit_get_current_test();
xe_pm_runtime_get(xe);
diff --git a/drivers/gpu/drm/xe/tests/xe_pci_test.c b/drivers/gpu/drm/xe/tests/xe_pci_test.c index a6705a536391d..744a37583d2d7 100644 --- a/drivers/gpu/drm/xe/tests/xe_pci_test.c +++ b/drivers/gpu/drm/xe/tests/xe_pci_test.c @@ -16,7 +16,7 @@
static void check_graphics_ip(const struct xe_graphics_desc *graphics) { - struct kunit *test = xe_cur_kunit(); + struct kunit *test = kunit_get_current_test(); u64 mask = graphics->hw_engine_mask;
/* RCS, CCS, and BCS engines are allowed on the graphics IP */ @@ -30,7 +30,7 @@ static void check_graphics_ip(const struct xe_graphics_desc *graphics)
static void check_media_ip(const struct xe_media_desc *media) { - struct kunit *test = xe_cur_kunit(); + struct kunit *test = kunit_get_current_test(); u64 mask = media->hw_engine_mask;
/* VCS, VECS and GSCCS engines are allowed on the media IP */ diff --git a/drivers/gpu/drm/xe/tests/xe_test.h b/drivers/gpu/drm/xe/tests/xe_test.h index 7a1ae213e750a..55e5b5bedccc6 100644 --- a/drivers/gpu/drm/xe/tests/xe_test.h +++ b/drivers/gpu/drm/xe/tests/xe_test.h @@ -9,8 +9,8 @@ #include <linux/types.h>
#if IS_ENABLED(CONFIG_DRM_XE_KUNIT_TEST) -#include <linux/sched.h> #include <kunit/test.h> +#include <kunit/test-bug.h>
/* * Each test that provides a kunit private test structure, place a test id @@ -32,7 +32,6 @@ struct xe_test_priv { #define XE_TEST_DECLARE(x) x #define XE_TEST_ONLY(x) unlikely(x) #define XE_TEST_EXPORT -#define xe_cur_kunit() current->kunit_test
/** * xe_cur_kunit_priv - Obtain the struct xe_test_priv pointed to by @@ -48,10 +47,10 @@ xe_cur_kunit_priv(enum xe_test_priv_id id) { struct xe_test_priv *priv;
- if (!xe_cur_kunit()) + if (!kunit_get_current_test()) return NULL;
- priv = xe_cur_kunit()->priv; + priv = kunit_get_current_test()->priv; return priv->id == id ? priv : NULL; }
@@ -60,7 +59,6 @@ xe_cur_kunit_priv(enum xe_test_priv_id id) #define XE_TEST_DECLARE(x) #define XE_TEST_ONLY(x) 0 #define XE_TEST_EXPORT static -#define xe_cur_kunit() NULL #define xe_cur_kunit_priv(_id) NULL
#endif
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: bd85e00fa489f5374c2bad0eac15842d2ec68045
WARNING: Author mismatch between patch and upstream commit:
Backport author: Lucas De Marchi lucas.demarchi@intel.com
Commit author: Michal Wajdeczko michal.wajdeczko@intel.com

Status in newer kernel trees:
6.12.y | Present (exact SHA1)
6.11.y | Not found
Note: The patch differs from the upstream commit: --- --- - 2024-11-23 08:44:57.402624667 -0500 +++ /tmp/tmp.fA20PN065W 2024-11-23 08:44:57.395856119 -0500 @@ -1,8 +1,11 @@ +commit bd85e00fa489f5374c2bad0eac15842d2ec68045 upstream. + We shouldn't use custom helper if there is a official one.
Signed-off-by: Michal Wajdeczko michal.wajdeczko@intel.com Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20240705191057.1110-2-michal.w... +Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/xe/tests/xe_bo.c | 4 ++-- drivers/gpu/drm/xe/tests/xe_dma_buf.c | 4 ++-- @@ -174,3 +177,6 @@ #define xe_cur_kunit_priv(_id) NULL
#endif +-- +2.47.0 + ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.11.y       | Success     | Success    |
From: Michal Wajdeczko michal.wajdeczko@intel.com
commit d6e850acc716d0fad756f09488d198db2077141e upstream.
The test case logic is implemented by the functions compiled as part of the core Xe driver module and then exported to build and register the test suite in the live test module.
But we don't need to export individual test case functions, we may just export the entire test suite. And we don't need to register this test suite in a separate file, it can be done in the main file of the live test module.
Signed-off-by: Michal Wajdeczko michal.wajdeczko@intel.com
Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-2-michal.w...
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
---
 drivers/gpu/drm/xe/tests/Makefile | 1 -
 drivers/gpu/drm/xe/tests/xe_bo.c | 20 +++++++++++++++-----
 drivers/gpu/drm/xe/tests/xe_bo_test.c | 21 ---------------------
 drivers/gpu/drm/xe/tests/xe_bo_test.h | 14 --------------
 drivers/gpu/drm/xe/tests/xe_live_test_mod.c | 5 +++++
 5 files changed, 20 insertions(+), 41 deletions(-)
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_bo_test.c
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_bo_test.h
diff --git a/drivers/gpu/drm/xe/tests/Makefile b/drivers/gpu/drm/xe/tests/Makefile index 6e58931fddd44..77331b0a04ad2 100644 --- a/drivers/gpu/drm/xe/tests/Makefile +++ b/drivers/gpu/drm/xe/tests/Makefile @@ -3,7 +3,6 @@ # "live" kunit tests obj-$(CONFIG_DRM_XE_KUNIT_TEST) += xe_live_test.o xe_live_test-y = xe_live_test_mod.o \ - xe_bo_test.o \ xe_dma_buf_test.o \ xe_migrate_test.o \ xe_mocs_test.o diff --git a/drivers/gpu/drm/xe/tests/xe_bo.c b/drivers/gpu/drm/xe/tests/xe_bo.c index 263e0afa8de0c..692e1b46b9cf9 100644 --- a/drivers/gpu/drm/xe/tests/xe_bo.c +++ b/drivers/gpu/drm/xe/tests/xe_bo.c @@ -6,7 +6,6 @@ #include <kunit/test.h> #include <kunit/visibility.h>
-#include "tests/xe_bo_test.h" #include "tests/xe_pci_test.h" #include "tests/xe_test.h"
@@ -177,11 +176,10 @@ static int ccs_test_run_device(struct xe_device *xe) return 0; }
-void xe_ccs_migrate_kunit(struct kunit *test) +static void xe_ccs_migrate_kunit(struct kunit *test) { xe_call_for_each_device(ccs_test_run_device); } -EXPORT_SYMBOL_IF_KUNIT(xe_ccs_migrate_kunit);
static int evict_test_run_tile(struct xe_device *xe, struct xe_tile *tile, struct kunit *test) { @@ -345,8 +343,20 @@ static int evict_test_run_device(struct xe_device *xe) return 0; }
-void xe_bo_evict_kunit(struct kunit *test) +static void xe_bo_evict_kunit(struct kunit *test) { xe_call_for_each_device(evict_test_run_device); } -EXPORT_SYMBOL_IF_KUNIT(xe_bo_evict_kunit); + +static struct kunit_case xe_bo_tests[] = { + KUNIT_CASE(xe_ccs_migrate_kunit), + KUNIT_CASE(xe_bo_evict_kunit), + {} +}; + +VISIBLE_IF_KUNIT +struct kunit_suite xe_bo_test_suite = { + .name = "xe_bo", + .test_cases = xe_bo_tests, +}; +EXPORT_SYMBOL_IF_KUNIT(xe_bo_test_suite); diff --git a/drivers/gpu/drm/xe/tests/xe_bo_test.c b/drivers/gpu/drm/xe/tests/xe_bo_test.c deleted file mode 100644 index a324cde77db82..0000000000000 --- a/drivers/gpu/drm/xe/tests/xe_bo_test.c +++ /dev/null @@ -1,21 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* - * Copyright © 2022 Intel Corporation - */ - -#include "xe_bo_test.h" - -#include <kunit/test.h> - -static struct kunit_case xe_bo_tests[] = { - KUNIT_CASE(xe_ccs_migrate_kunit), - KUNIT_CASE(xe_bo_evict_kunit), - {} -}; - -static struct kunit_suite xe_bo_test_suite = { - .name = "xe_bo", - .test_cases = xe_bo_tests, -}; - -kunit_test_suite(xe_bo_test_suite); diff --git a/drivers/gpu/drm/xe/tests/xe_bo_test.h b/drivers/gpu/drm/xe/tests/xe_bo_test.h deleted file mode 100644 index 0113ab45066a4..0000000000000 --- a/drivers/gpu/drm/xe/tests/xe_bo_test.h +++ /dev/null @@ -1,14 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 AND MIT */ -/* - * Copyright © 2023 Intel Corporation - */ - -#ifndef _XE_BO_TEST_H_ -#define _XE_BO_TEST_H_ - -struct kunit; - -void xe_ccs_migrate_kunit(struct kunit *test); -void xe_bo_evict_kunit(struct kunit *test); - -#endif diff --git a/drivers/gpu/drm/xe/tests/xe_live_test_mod.c b/drivers/gpu/drm/xe/tests/xe_live_test_mod.c index eb1ea99a5a8b1..3bffcbd233b29 100644 --- a/drivers/gpu/drm/xe/tests/xe_live_test_mod.c +++ b/drivers/gpu/drm/xe/tests/xe_live_test_mod.c @@ -3,6 +3,11 @@ * Copyright © 2023 Intel Corporation */ #include <linux/module.h> +#include <kunit/test.h> + +extern struct kunit_suite xe_bo_test_suite; + 
+kunit_test_suite(xe_bo_test_suite);
MODULE_AUTHOR("Intel Corporation"); MODULE_LICENSE("GPL");
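The layout change in this patch (test cases become static, only the suite object is visible to the live test module) can be modelled outside the kernel. The following is a userspace sketch only: `struct mock_case`, `struct mock_suite`, `mock_bo_suite` and `mock_run_suite()` are hypothetical and merely imitate the shape of `struct kunit_case`, `struct kunit_suite` and the empty-entry-terminated case array. They are not the KUnit API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical imitation of a test case entry: a name and a function. */
struct mock_case {
	const char *name;
	void (*run)(int *ran);
};

/* Hypothetical imitation of a suite: a name and a terminated case array. */
struct mock_suite {
	const char *name;
	const struct mock_case *cases;
};

/* The individual cases stay private to this file (static). */
static void mock_ccs_migrate_case(int *ran) { (*ran)++; }
static void mock_bo_evict_case(int *ran) { (*ran)++; }

static const struct mock_case mock_bo_cases[] = {
	{ "ccs_migrate", mock_ccs_migrate_case },
	{ "bo_evict", mock_bo_evict_case },
	{ NULL, NULL },	/* terminator, like the {} entry in the patch */
};

/* Only the suite has external linkage; this is what the other module sees. */
const struct mock_suite mock_bo_suite = {
	.name = "mock_bo",
	.cases = mock_bo_cases,
};

/* Stand-in for the registering side: walk the suite, return cases run. */
static int mock_run_suite(const struct mock_suite *suite)
{
	int ran = 0;

	assert(suite && suite->cases);
	for (const struct mock_case *c = suite->cases; c->run; c++)
		c->run(&ran);
	return ran;
}
```

The consumer needs a single extern declaration for the suite rather than one prototype per test case, which is why the separate xe_bo_test.c and xe_bo_test.h files can be deleted.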
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: d6e850acc716d0fad756f09488d198db2077141e
WARNING: Author mismatch between patch and upstream commit:
Backport author: Lucas De Marchi lucas.demarchi@intel.com
Commit author: Michal Wajdeczko michal.wajdeczko@intel.com

Status in newer kernel trees:
6.12.y | Present (exact SHA1)
6.11.y | Not found
Note: The patch differs from the upstream commit: --- --- - 2024-11-23 08:54:18.968301678 -0500 +++ /tmp/tmp.kTGzsMN7DH 2024-11-23 08:54:18.959956694 -0500 @@ -1,3 +1,5 @@ +commit d6e850acc716d0fad756f09488d198db2077141e upstream. + The test case logic is implemented by the functions compiled as part of the core Xe driver module and then exported to build and register the test suite in the live test module. @@ -10,6 +12,7 @@ Signed-off-by: Michal Wajdeczko michal.wajdeczko@intel.com Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-2-michal.w... +Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/xe/tests/Makefile | 1 - drivers/gpu/drm/xe/tests/xe_bo.c | 20 +++++++++++++++----- @@ -143,3 +146,6 @@
MODULE_AUTHOR("Intel Corporation"); MODULE_LICENSE("GPL"); +-- +2.47.0 + ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.11.y       | Success     | Success    |
From: Michal Wajdeczko michal.wajdeczko@intel.com
commit ff10c99ab1e644fed578dce13e94e372d2c688c3 upstream.
The test case logic is implemented by the functions compiled as part of the core Xe driver module and then exported to build and register the test suite in the live test module.
But we don't need to export individual test case functions, we may just export the entire test suite. And we don't need to register this test suite in a separate file, it can be done in the main file of the live test module.
Signed-off-by: Michal Wajdeczko michal.wajdeczko@intel.com
Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-3-michal.w...
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
---
 drivers/gpu/drm/xe/tests/Makefile | 1 -
 drivers/gpu/drm/xe/tests/xe_dma_buf.c | 16 +++++++++++++---
 drivers/gpu/drm/xe/tests/xe_dma_buf_test.c | 20 --------------------
 drivers/gpu/drm/xe/tests/xe_dma_buf_test.h | 13 -------------
 drivers/gpu/drm/xe/tests/xe_live_test_mod.c | 2 ++
 5 files changed, 15 insertions(+), 37 deletions(-)
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_dma_buf_test.c
 delete mode 100644 drivers/gpu/drm/xe/tests/xe_dma_buf_test.h
diff --git a/drivers/gpu/drm/xe/tests/Makefile b/drivers/gpu/drm/xe/tests/Makefile index 77331b0a04ad2..c77a5882d094e 100644 --- a/drivers/gpu/drm/xe/tests/Makefile +++ b/drivers/gpu/drm/xe/tests/Makefile @@ -3,7 +3,6 @@ # "live" kunit tests obj-$(CONFIG_DRM_XE_KUNIT_TEST) += xe_live_test.o xe_live_test-y = xe_live_test_mod.o \ - xe_dma_buf_test.o \ xe_migrate_test.o \ xe_mocs_test.o
diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf.c b/drivers/gpu/drm/xe/tests/xe_dma_buf.c index b56013963911e..4f9dc41e13de9 100644 --- a/drivers/gpu/drm/xe/tests/xe_dma_buf.c +++ b/drivers/gpu/drm/xe/tests/xe_dma_buf.c @@ -8,7 +8,6 @@ #include <kunit/test.h> #include <kunit/visibility.h>
-#include "tests/xe_dma_buf_test.h" #include "tests/xe_pci_test.h"
#include "xe_pci.h" @@ -274,8 +273,19 @@ static int dma_buf_run_device(struct xe_device *xe) return 0; }
-void xe_dma_buf_kunit(struct kunit *test) +static void xe_dma_buf_kunit(struct kunit *test) { xe_call_for_each_device(dma_buf_run_device); } -EXPORT_SYMBOL_IF_KUNIT(xe_dma_buf_kunit); + +static struct kunit_case xe_dma_buf_tests[] = { + KUNIT_CASE(xe_dma_buf_kunit), + {} +}; + +VISIBLE_IF_KUNIT +struct kunit_suite xe_dma_buf_test_suite = { + .name = "xe_dma_buf", + .test_cases = xe_dma_buf_tests, +}; +EXPORT_SYMBOL_IF_KUNIT(xe_dma_buf_test_suite); diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf_test.c b/drivers/gpu/drm/xe/tests/xe_dma_buf_test.c deleted file mode 100644 index 99cdb718b6c61..0000000000000 --- a/drivers/gpu/drm/xe/tests/xe_dma_buf_test.c +++ /dev/null @@ -1,20 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* - * Copyright © 2022 Intel Corporation - */ - -#include "xe_dma_buf_test.h" - -#include <kunit/test.h> - -static struct kunit_case xe_dma_buf_tests[] = { - KUNIT_CASE(xe_dma_buf_kunit), - {} -}; - -static struct kunit_suite xe_dma_buf_test_suite = { - .name = "xe_dma_buf", - .test_cases = xe_dma_buf_tests, -}; - -kunit_test_suite(xe_dma_buf_test_suite); diff --git a/drivers/gpu/drm/xe/tests/xe_dma_buf_test.h b/drivers/gpu/drm/xe/tests/xe_dma_buf_test.h deleted file mode 100644 index e6b464ddd5260..0000000000000 --- a/drivers/gpu/drm/xe/tests/xe_dma_buf_test.h +++ /dev/null @@ -1,13 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 AND MIT */ -/* - * Copyright © 2023 Intel Corporation - */ - -#ifndef _XE_DMA_BUF_TEST_H_ -#define _XE_DMA_BUF_TEST_H_ - -struct kunit; - -void xe_dma_buf_kunit(struct kunit *test); - -#endif diff --git a/drivers/gpu/drm/xe/tests/xe_live_test_mod.c b/drivers/gpu/drm/xe/tests/xe_live_test_mod.c index 3bffcbd233b29..d9da15d9fe3fd 100644 --- a/drivers/gpu/drm/xe/tests/xe_live_test_mod.c +++ b/drivers/gpu/drm/xe/tests/xe_live_test_mod.c @@ -6,8 +6,10 @@ #include <kunit/test.h>
extern struct kunit_suite xe_bo_test_suite; +extern struct kunit_suite xe_dma_buf_test_suite;
kunit_test_suite(xe_bo_test_suite); +kunit_test_suite(xe_dma_buf_test_suite);
MODULE_AUTHOR("Intel Corporation"); MODULE_LICENSE("GPL");
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: ff10c99ab1e644fed578dce13e94e372d2c688c3
WARNING: Author mismatch between patch and upstream commit:
Backport author: Lucas De Marchi lucas.demarchi@intel.com
Commit author: Michal Wajdeczko michal.wajdeczko@intel.com

Status in newer kernel trees:
6.12.y | Present (exact SHA1)
6.11.y | Not found
Note: The patch differs from the upstream commit: --- --- - 2024-11-23 09:03:30.172146718 -0500 +++ /tmp/tmp.4Fq75jAh8B 2024-11-23 09:03:30.168217727 -0500 @@ -1,3 +1,5 @@ +commit ff10c99ab1e644fed578dce13e94e372d2c688c3 upstream. + The test case logic is implemented by the functions compiled as part of the core Xe driver module and then exported to build and register the test suite in the live test module. @@ -10,6 +12,7 @@ Signed-off-by: Michal Wajdeczko michal.wajdeczko@intel.com Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-3-michal.w... +Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/xe/tests/Makefile | 1 - drivers/gpu/drm/xe/tests/xe_dma_buf.c | 16 +++++++++++++--- @@ -126,3 +129,6 @@
MODULE_AUTHOR("Intel Corporation"); MODULE_LICENSE("GPL"); +-- +2.47.0 + ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.11.y       | Success     | Success    |
From: Michal Wajdeczko michal.wajdeczko@intel.com
commit 0237368193e897aadeea9801126c101e33047354 upstream.
The test case logic is implemented by the functions compiled as part of the core Xe driver module and then exported to build and register the test suite in the live test module.
But we don't need to export individual test case functions; we may just export the entire test suite. And we don't need to register this test suite in a separate file; it can be done in the main file of the live test module.
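The layout described above can be modelled outside the kernel. The following is only a userspace sketch: `demo_case`, `demo_suite` and `demo_run_suite` are hypothetical stand-ins, not the KUnit types or API. It shows the case function staying private to one file while only the suite object is made visible to the code that registers and runs it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the KUnit structures. */
struct demo_case {
	const char *name;
	void (*run)(int *failures);
};

struct demo_suite {
	const char *name;
	const struct demo_case *cases;	/* terminated by an entry with run == NULL */
};

/* "Core module" side: the test case function has internal linkage. */
static void demo_migrate_sanity(int *failures)
{
	(void)failures;	/* this placeholder case always passes */
}

static const struct demo_case demo_migrate_cases[] = {
	{ "demo_migrate_sanity", demo_migrate_sanity },
	{ NULL, NULL },
};

/* Only the suite object is visible outside this file. */
const struct demo_suite demo_migrate_suite = {
	"demo_migrate", demo_migrate_cases,
};

/*
 * "Test module" side: runs every case of a suite it only knows by its
 * exported symbol. Returns the number of cases run, or a negative
 * failure count if any case reported a failure.
 */
int demo_run_suite(const struct demo_suite *suite)
{
	int failures = 0, ran = 0;

	assert(suite && suite->cases);
	for (const struct demo_case *c = suite->cases; c->run; c++) {
		c->run(&failures);
		ran++;
	}
	return failures ? -failures : ran;
}
```

In this model, adding a new case only touches the file that owns the suite; the runner side needs no new declarations.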
Signed-off-by: Michal Wajdeczko michal.wajdeczko@intel.com Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-4-michal.w... Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/xe/tests/Makefile | 1 - drivers/gpu/drm/xe/tests/xe_live_test_mod.c | 2 ++ drivers/gpu/drm/xe/tests/xe_migrate.c | 16 +++++++++++++--- drivers/gpu/drm/xe/tests/xe_migrate_test.c | 20 -------------------- drivers/gpu/drm/xe/tests/xe_migrate_test.h | 13 ------------- 5 files changed, 15 insertions(+), 37 deletions(-) delete mode 100644 drivers/gpu/drm/xe/tests/xe_migrate_test.c delete mode 100644 drivers/gpu/drm/xe/tests/xe_migrate_test.h
diff --git a/drivers/gpu/drm/xe/tests/Makefile b/drivers/gpu/drm/xe/tests/Makefile index c77a5882d094e..32ce1d6df0fa0 100644 --- a/drivers/gpu/drm/xe/tests/Makefile +++ b/drivers/gpu/drm/xe/tests/Makefile @@ -3,7 +3,6 @@ # "live" kunit tests obj-$(CONFIG_DRM_XE_KUNIT_TEST) += xe_live_test.o xe_live_test-y = xe_live_test_mod.o \ - xe_migrate_test.o \ xe_mocs_test.o
# Normal kunit tests diff --git a/drivers/gpu/drm/xe/tests/xe_live_test_mod.c b/drivers/gpu/drm/xe/tests/xe_live_test_mod.c index d9da15d9fe3fd..4c1e07a0d4778 100644 --- a/drivers/gpu/drm/xe/tests/xe_live_test_mod.c +++ b/drivers/gpu/drm/xe/tests/xe_live_test_mod.c @@ -7,9 +7,11 @@
extern struct kunit_suite xe_bo_test_suite; extern struct kunit_suite xe_dma_buf_test_suite; +extern struct kunit_suite xe_migrate_test_suite;
kunit_test_suite(xe_bo_test_suite); kunit_test_suite(xe_dma_buf_test_suite); +kunit_test_suite(xe_migrate_test_suite);
MODULE_AUTHOR("Intel Corporation"); MODULE_LICENSE("GPL"); diff --git a/drivers/gpu/drm/xe/tests/xe_migrate.c b/drivers/gpu/drm/xe/tests/xe_migrate.c index d277a21ccf910..0de0e0c666230 100644 --- a/drivers/gpu/drm/xe/tests/xe_migrate.c +++ b/drivers/gpu/drm/xe/tests/xe_migrate.c @@ -6,7 +6,6 @@ #include <kunit/test.h> #include <kunit/visibility.h>
-#include "tests/xe_migrate_test.h" #include "tests/xe_pci_test.h"
#include "xe_pci.h" @@ -354,8 +353,19 @@ static int migrate_test_run_device(struct xe_device *xe) return 0; }
-void xe_migrate_sanity_kunit(struct kunit *test) +static void xe_migrate_sanity_kunit(struct kunit *test) { xe_call_for_each_device(migrate_test_run_device); } -EXPORT_SYMBOL_IF_KUNIT(xe_migrate_sanity_kunit); + +static struct kunit_case xe_migrate_tests[] = { + KUNIT_CASE(xe_migrate_sanity_kunit), + {} +}; + +VISIBLE_IF_KUNIT +struct kunit_suite xe_migrate_test_suite = { + .name = "xe_migrate", + .test_cases = xe_migrate_tests, +}; +EXPORT_SYMBOL_IF_KUNIT(xe_migrate_test_suite); diff --git a/drivers/gpu/drm/xe/tests/xe_migrate_test.c b/drivers/gpu/drm/xe/tests/xe_migrate_test.c deleted file mode 100644 index eb0d8963419cb..0000000000000 --- a/drivers/gpu/drm/xe/tests/xe_migrate_test.c +++ /dev/null @@ -1,20 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -/* - * Copyright © 2022 Intel Corporation - */ - -#include "xe_migrate_test.h" - -#include <kunit/test.h> - -static struct kunit_case xe_migrate_tests[] = { - KUNIT_CASE(xe_migrate_sanity_kunit), - {} -}; - -static struct kunit_suite xe_migrate_test_suite = { - .name = "xe_migrate", - .test_cases = xe_migrate_tests, -}; - -kunit_test_suite(xe_migrate_test_suite); diff --git a/drivers/gpu/drm/xe/tests/xe_migrate_test.h b/drivers/gpu/drm/xe/tests/xe_migrate_test.h deleted file mode 100644 index 7c645c66824f8..0000000000000 --- a/drivers/gpu/drm/xe/tests/xe_migrate_test.h +++ /dev/null @@ -1,13 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 AND MIT */ -/* - * Copyright © 2023 Intel Corporation - */ - -#ifndef _XE_MIGRATE_TEST_H_ -#define _XE_MIGRATE_TEST_H_ - -struct kunit; - -void xe_migrate_sanity_kunit(struct kunit *test); - -#endif
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 0237368193e897aadeea9801126c101e33047354
WARNING: Author mismatch between patch and upstream commit:
Backport author: Lucas De Marchi lucas.demarchi@intel.com
Commit author: Michal Wajdeczko michal.wajdeczko@intel.com

Status in newer kernel trees:
6.12.y | Present (exact SHA1)
6.11.y | Not found
Note: The patch differs from the upstream commit: --- --- - 2024-11-23 09:14:03.002685686 -0500 +++ /tmp/tmp.D6Q92Tn9Y4 2024-11-23 09:14:02.995375557 -0500 @@ -1,3 +1,5 @@ +commit 0237368193e897aadeea9801126c101e33047354 upstream. + The test case logic is implemented by the functions compiled as part of the core Xe driver module and then exported to build and register the test suite in the live test module. @@ -10,6 +12,7 @@ Signed-off-by: Michal Wajdeczko michal.wajdeczko@intel.com Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20240708111210.1154-4-michal.w... +Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/xe/tests/Makefile | 1 - drivers/gpu/drm/xe/tests/xe_live_test_mod.c | 2 ++ @@ -127,3 +130,6 @@ -void xe_migrate_sanity_kunit(struct kunit *test); - -#endif +-- +2.47.0 + ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.11.y       | Success     | Success    |
From: Akshata Jahagirdar akshata.jahagirdar@intel.com
commit 54f07cfc016226c3959e0b3b7ed306124d986ce4 upstream.
This test verifies that the main and ccs data are cleared during bo creation. The motivation to use KUnit instead of IGT is that, with IGT, although we can verify whether the data is zero following bo creation, we cannot confirm whether that zero value is the result of our clear function or simply because the initial data present was zero.
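The reasoning above (write a known non-zero pattern first, then clear, then check for zero) can be illustrated with a small userspace sketch. It is not the driver test: `buffer_clear`, `broken_clear` and `clear_is_verified` are hypothetical names, and a plain byte array stands in for the buffer object. Only the 0xd0 fill pattern and the first/last 64-bit checks are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the clear operation under test. */
void buffer_clear(uint8_t *buf, size_t size)
{
	memset(buf, 0, size);
}

/* Hypothetical faulty clear that does nothing, used to show detection. */
void broken_clear(uint8_t *buf, size_t size)
{
	(void)buf;
	(void)size;
}

/* Read a 64-bit value at a byte offset without alignment assumptions. */
uint64_t read_u64(const uint8_t *buf, size_t offset)
{
	uint64_t v;

	memcpy(&v, buf + offset, sizeof(v));
	return v;
}

/*
 * Returns true only if the non-zero pattern was observed before the clear
 * and zero was observed afterwards, at both the first and last 8 bytes.
 * Pre-filling the buffer is what rules out "it was already zero".
 */
bool clear_is_verified(uint8_t *buf, size_t size,
		       void (*clear_fn)(uint8_t *, size_t))
{
	const uint64_t pattern = 0xd0d0d0d0d0d0d0d0ULL;

	assert(size >= sizeof(uint64_t));

	memset(buf, 0xd0, size);
	if (read_u64(buf, 0) != pattern ||
	    read_u64(buf, size - sizeof(uint64_t)) != pattern)
		return false;

	clear_fn(buf, size);

	return read_u64(buf, 0) == 0 &&
	       read_u64(buf, size - sizeof(uint64_t)) == 0;
}
```

A check that only inspected the buffer after creation would accept `broken_clear` whenever the memory happened to start out as zero; the pre-fill step is what makes the two cases distinguishable.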
v2: Updated the mutex_lock and unlock logic, Changed out_unlock to out_put. (Matt)
v3: Added missing dma_fence_put(). (Nirmoy)
v4: Rebase.
v5: Add missing bo_put(), bo_unlock() calls. (Matt Auld)
Signed-off-by: Akshata Jahagirdar akshata.jahagirdar@intel.com Reviewed-by: Himal Prasad Ghimiray himal.prasad.ghimiray@intel.com Acked-by: Nirmoy Das nirmoy.das@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/c07603439b88cfc99e78c0e2069327... Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/xe/tests/xe_migrate.c | 276 ++++++++++++++++++++++++++ 1 file changed, 276 insertions(+)
diff --git a/drivers/gpu/drm/xe/tests/xe_migrate.c b/drivers/gpu/drm/xe/tests/xe_migrate.c index 0de0e0c666230..353b908845f7d 100644 --- a/drivers/gpu/drm/xe/tests/xe_migrate.c +++ b/drivers/gpu/drm/xe/tests/xe_migrate.c @@ -358,8 +358,284 @@ static void xe_migrate_sanity_kunit(struct kunit *test) xe_call_for_each_device(migrate_test_run_device); }
+static struct dma_fence *blt_copy(struct xe_tile *tile, + struct xe_bo *src_bo, struct xe_bo *dst_bo, + bool copy_only_ccs, const char *str, struct kunit *test) +{ + struct xe_gt *gt = tile->primary_gt; + struct xe_migrate *m = tile->migrate; + struct xe_device *xe = gt_to_xe(gt); + struct dma_fence *fence = NULL; + u64 size = src_bo->size; + struct xe_res_cursor src_it, dst_it; + struct ttm_resource *src = src_bo->ttm.resource, *dst = dst_bo->ttm.resource; + u64 src_L0_ofs, dst_L0_ofs; + u32 src_L0_pt, dst_L0_pt; + u64 src_L0, dst_L0; + int err; + bool src_is_vram = mem_type_is_vram(src->mem_type); + bool dst_is_vram = mem_type_is_vram(dst->mem_type); + + if (!src_is_vram) + xe_res_first_sg(xe_bo_sg(src_bo), 0, size, &src_it); + else + xe_res_first(src, 0, size, &src_it); + + if (!dst_is_vram) + xe_res_first_sg(xe_bo_sg(dst_bo), 0, size, &dst_it); + else + xe_res_first(dst, 0, size, &dst_it); + + while (size) { + u32 batch_size = 2; /* arb_clear() + MI_BATCH_BUFFER_END */ + struct xe_sched_job *job; + struct xe_bb *bb; + u32 flush_flags = 0; + u32 update_idx; + u32 avail_pts = max_mem_transfer_per_pass(xe) / LEVEL0_PAGE_TABLE_ENCODE_SIZE; + + src_L0 = xe_migrate_res_sizes(m, &src_it); + dst_L0 = xe_migrate_res_sizes(m, &dst_it); + + src_L0 = min(src_L0, dst_L0); + + batch_size += pte_update_size(m, src_is_vram, src_is_vram, src, &src_it, &src_L0, + &src_L0_ofs, &src_L0_pt, 0, 0, + avail_pts); + + batch_size += pte_update_size(m, dst_is_vram, dst_is_vram, dst, &dst_it, &src_L0, + &dst_L0_ofs, &dst_L0_pt, 0, + avail_pts, avail_pts); + + /* Add copy commands size here */ + batch_size += ((copy_only_ccs) ? 0 : EMIT_COPY_DW) + + ((xe_device_has_flat_ccs(xe) && copy_only_ccs) ? 
EMIT_COPY_CCS_DW : 0); + + bb = xe_bb_new(gt, batch_size, xe->info.has_usm); + if (IS_ERR(bb)) { + err = PTR_ERR(bb); + goto err_sync; + } + + if (src_is_vram) + xe_res_next(&src_it, src_L0); + else + emit_pte(m, bb, src_L0_pt, src_is_vram, false, + &src_it, src_L0, src); + + if (dst_is_vram) + xe_res_next(&dst_it, src_L0); + else + emit_pte(m, bb, dst_L0_pt, dst_is_vram, false, + &dst_it, src_L0, dst); + + bb->cs[bb->len++] = MI_BATCH_BUFFER_END; + update_idx = bb->len; + if (!copy_only_ccs) + emit_copy(gt, bb, src_L0_ofs, dst_L0_ofs, src_L0, XE_PAGE_SIZE); + + if (copy_only_ccs) + flush_flags = xe_migrate_ccs_copy(m, bb, src_L0_ofs, + src_is_vram, dst_L0_ofs, + dst_is_vram, src_L0, dst_L0_ofs, + copy_only_ccs); + + job = xe_bb_create_migration_job(m->q, bb, + xe_migrate_batch_base(m, xe->info.has_usm), + update_idx); + if (IS_ERR(job)) { + err = PTR_ERR(job); + goto err; + } + + xe_sched_job_add_migrate_flush(job, flush_flags); + + mutex_lock(&m->job_mutex); + xe_sched_job_arm(job); + dma_fence_put(fence); + fence = dma_fence_get(&job->drm.s_fence->finished); + xe_sched_job_push(job); + + dma_fence_put(m->fence); + m->fence = dma_fence_get(fence); + + mutex_unlock(&m->job_mutex); + + xe_bb_free(bb, fence); + size -= src_L0; + continue; + +err: + xe_bb_free(bb, NULL); + +err_sync: + if (fence) { + dma_fence_wait(fence, false); + dma_fence_put(fence); + } + return ERR_PTR(err); + } + + return fence; +} + +static void test_clear(struct xe_device *xe, struct xe_tile *tile, + struct xe_bo *sys_bo, struct xe_bo *vram_bo, struct kunit *test) +{ + struct dma_fence *fence; + u64 expected, retval; + + expected = 0xd0d0d0d0d0d0d0d0; + xe_map_memset(xe, &sys_bo->vmap, 0, 0xd0, sys_bo->size); + + fence = blt_copy(tile, sys_bo, vram_bo, false, "Blit copy from sysmem to vram", test); + if (!sanity_fence_failed(xe, fence, "Blit copy from sysmem to vram", test)) { + retval = xe_map_rd(xe, &vram_bo->vmap, 0, u64); + if (retval == expected) + KUNIT_FAIL(test, "Sanity check failed: 
VRAM must have compressed value\n"); + } + dma_fence_put(fence); + + fence = blt_copy(tile, vram_bo, sys_bo, false, "Blit copy from vram to sysmem", test); + if (!sanity_fence_failed(xe, fence, "Blit copy from vram to sysmem", test)) { + retval = xe_map_rd(xe, &sys_bo->vmap, 0, u64); + check(retval, expected, "Decompressed value must be equal to initial value", test); + retval = xe_map_rd(xe, &sys_bo->vmap, sys_bo->size - 8, u64); + check(retval, expected, "Decompressed value must be equal to initial value", test); + } + dma_fence_put(fence); + + kunit_info(test, "Clear vram buffer object\n"); + expected = 0x0000000000000000; + fence = xe_migrate_clear(tile->migrate, vram_bo, vram_bo->ttm.resource); + if (sanity_fence_failed(xe, fence, "Clear vram_bo", test)) + return; + dma_fence_put(fence); + + fence = blt_copy(tile, vram_bo, sys_bo, + false, "Blit copy from vram to sysmem", test); + if (!sanity_fence_failed(xe, fence, "Clear main buffer data", test)) { + retval = xe_map_rd(xe, &sys_bo->vmap, 0, u64); + check(retval, expected, "Clear main buffer first value", test); + retval = xe_map_rd(xe, &sys_bo->vmap, sys_bo->size - 8, u64); + check(retval, expected, "Clear main buffer last value", test); + } + dma_fence_put(fence); + + fence = blt_copy(tile, vram_bo, sys_bo, + true, "Blit surf copy from vram to sysmem", test); + if (!sanity_fence_failed(xe, fence, "Clear ccs buffer data", test)) { + retval = xe_map_rd(xe, &sys_bo->vmap, 0, u64); + check(retval, expected, "Clear ccs data first value", test); + retval = xe_map_rd(xe, &sys_bo->vmap, sys_bo->size - 8, u64); + check(retval, expected, "Clear ccs data last value", test); + } + dma_fence_put(fence); +} + +static void validate_ccs_test_run_tile(struct xe_device *xe, struct xe_tile *tile, + struct kunit *test) +{ + struct xe_bo *sys_bo, *vram_bo; + unsigned int bo_flags = XE_BO_FLAG_VRAM_IF_DGFX(tile); + long ret; + + sys_bo = xe_bo_create_user(xe, NULL, NULL, SZ_4M, + DRM_XE_GEM_CPU_CACHING_WC, ttm_bo_type_device, + 
XE_BO_FLAG_SYSTEM | XE_BO_FLAG_NEEDS_CPU_ACCESS); + + if (IS_ERR(sys_bo)) { + KUNIT_FAIL(test, "xe_bo_create() failed with err=%ld\n", + PTR_ERR(sys_bo)); + return; + } + + xe_bo_lock(sys_bo, false); + ret = xe_bo_validate(sys_bo, NULL, false); + if (ret) { + KUNIT_FAIL(test, "Failed to validate system bo for: %li\n", ret); + goto free_sysbo; + } + + ret = xe_bo_vmap(sys_bo); + if (ret) { + KUNIT_FAIL(test, "Failed to vmap system bo: %li\n", ret); + goto free_sysbo; + } + xe_bo_unlock(sys_bo); + + vram_bo = xe_bo_create_user(xe, NULL, NULL, SZ_4M, DRM_XE_GEM_CPU_CACHING_WC, + ttm_bo_type_device, bo_flags | XE_BO_FLAG_NEEDS_CPU_ACCESS); + if (IS_ERR(vram_bo)) { + KUNIT_FAIL(test, "xe_bo_create() failed with err=%ld\n", + PTR_ERR(vram_bo)); + return; + } + + xe_bo_lock(vram_bo, false); + ret = xe_bo_validate(vram_bo, NULL, false); + if (ret) { + KUNIT_FAIL(test, "Failed to validate vram bo for: %li\n", ret); + goto free_vrambo; + } + + ret = xe_bo_vmap(vram_bo); + if (ret) { + KUNIT_FAIL(test, "Failed to vmap vram bo: %li\n", ret); + goto free_vrambo; + } + + test_clear(xe, tile, sys_bo, vram_bo, test); + xe_bo_unlock(vram_bo); + + xe_bo_lock(vram_bo, false); + xe_bo_vunmap(vram_bo); + xe_bo_unlock(vram_bo); + + xe_bo_lock(sys_bo, false); + xe_bo_vunmap(sys_bo); + xe_bo_unlock(sys_bo); +free_vrambo: + xe_bo_put(vram_bo); +free_sysbo: + xe_bo_put(sys_bo); +} + +static int validate_ccs_test_run_device(struct xe_device *xe) +{ + struct kunit *test = kunit_get_current_test(); + struct xe_tile *tile; + int id; + + if (!xe_device_has_flat_ccs(xe)) { + kunit_info(test, "Skipping non-flat-ccs device.\n"); + return 0; + } + + if (!(GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe))) { + kunit_info(test, "Skipping non-xe2 discrete device %s.\n", + dev_name(xe->drm.dev)); + return 0; + } + + xe_pm_runtime_get(xe); + + for_each_tile(tile, xe, id) + validate_ccs_test_run_tile(xe, tile, test); + + xe_pm_runtime_put(xe); + + return 0; +} + +static void xe_validate_ccs_kunit(struct kunit *test) 
+{ + xe_call_for_each_device(validate_ccs_test_run_device); +} + static struct kunit_case xe_migrate_tests[] = { KUNIT_CASE(xe_migrate_sanity_kunit), + KUNIT_CASE(xe_validate_ccs_kunit), {} };
[ Sasha's backport helper bot ]
Hi,
The upstream commit SHA1 provided is correct: 54f07cfc016226c3959e0b3b7ed306124d986ce4
WARNING: Author mismatch between patch and upstream commit:
Backport author: Lucas De Marchi lucas.demarchi@intel.com
Commit author: Akshata Jahagirdar akshata.jahagirdar@intel.com

Status in newer kernel trees:
6.12.y | Present (exact SHA1)
6.11.y | Not found
Note: The patch differs from the upstream commit: --- --- - 2024-11-23 09:22:46.652978735 -0500 +++ /tmp/tmp.PlkTLzjdk1 2024-11-23 09:22:46.648420454 -0500 @@ -1,3 +1,5 @@ +commit 54f07cfc016226c3959e0b3b7ed306124d986ce4 upstream. + This test verifies if the main and ccs data are cleared during bo creation. The motivation to use Kunit instead of IGT is that, although we can verify whether the data is zero following bo creation, @@ -18,6 +20,7 @@ Acked-by: Nirmoy Das nirmoy.das@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/c07603439b88cfc99e78c0e2069327... +Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/xe/tests/xe_migrate.c | 276 ++++++++++++++++++++++++++ 1 file changed, 276 insertions(+) @@ -311,3 +314,6 @@ {} };
+-- +2.47.0 + ---
Results of testing on various branches:
| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.11.y       | Success     | Failed     |
Build Errors: Build error for stable/linux-6.11.y: In file included from drivers/gpu/drm/xe/xe_migrate.c:1496: drivers/gpu/drm/xe/tests/xe_migrate.c: In function 'blt_copy': drivers/gpu/drm/xe/tests/xe_migrate.c:402:63: error: incompatible type for argument 3 of 'pte_update_size' 402 | batch_size += pte_update_size(m, src_is_vram, src_is_vram, src, &src_it, &src_L0, | ^~~~~~~~~~~ | | | bool {aka _Bool} drivers/gpu/drm/xe/xe_migrate.c:493:49: note: expected 'struct ttm_resource *' but argument is of type 'bool' {aka '_Bool'} 493 | struct ttm_resource *res, | ~~~~~~~~~~~~~~~~~~~~~^~~ drivers/gpu/drm/xe/tests/xe_migrate.c:402:76: error: passing argument 4 of 'pte_update_size' from incompatible pointer type [-Wincompatible-pointer-types] 402 | batch_size += pte_update_size(m, src_is_vram, src_is_vram, src, &src_it, &src_L0, | ^~~ | | | struct ttm_resource * drivers/gpu/drm/xe/xe_migrate.c:494:50: note: expected 'struct xe_res_cursor *' but argument is of type 'struct ttm_resource *' 494 | struct xe_res_cursor *cur, | ~~~~~~~~~~~~~~~~~~~~~~^~~ drivers/gpu/drm/xe/tests/xe_migrate.c:402:81: error: passing argument 5 of 'pte_update_size' from incompatible pointer type [-Wincompatible-pointer-types] 402 | batch_size += pte_update_size(m, src_is_vram, src_is_vram, src, &src_it, &src_L0, | ^~~~~~~ | | | struct xe_res_cursor * drivers/gpu/drm/xe/xe_migrate.c:495:33: note: expected 'u64 *' {aka 'long long unsigned int *'} but argument is of type 'struct xe_res_cursor *' 495 | u64 *L0, u64 *L0_ofs, u32 *L0_pt, | ~~~~~^~ drivers/gpu/drm/xe/tests/xe_migrate.c:403:47: error: passing argument 7 of 'pte_update_size' from incompatible pointer type [-Wincompatible-pointer-types] 403 | &src_L0_ofs, &src_L0_pt, 0, 0, | ^~~~~~~~~~~ | | | u64 * {aka long long unsigned int *} drivers/gpu/drm/xe/xe_migrate.c:495:55: note: expected 'u32 *' {aka 'unsigned int *'} but argument is of type 'u64 *' {aka 'long long unsigned int *'} 495 | u64 *L0, u64 *L0_ofs, u32 *L0_pt, | ~~~~~^~~~~ 
drivers/gpu/drm/xe/tests/xe_migrate.c:403:60: error: passing argument 8 of 'pte_update_size' makes integer from pointer without a cast [-Wint-conversion] 403 | &src_L0_ofs, &src_L0_pt, 0, 0, | ^~~~~~~~~~ | | | u32 * {aka unsigned int *} drivers/gpu/drm/xe/xe_migrate.c:496:32: note: expected 'u32' {aka 'unsigned int'} but argument is of type 'u32 *' {aka 'unsigned int *'} 496 | u32 cmd_size, u32 pt_ofs, u32 avail_pts) | ~~~~^~~~~~~~ drivers/gpu/drm/xe/tests/xe_migrate.c:402:31: error: too many arguments to function 'pte_update_size' 402 | batch_size += pte_update_size(m, src_is_vram, src_is_vram, src, &src_it, &src_L0, | ^~~~~~~~~~~~~~~ drivers/gpu/drm/xe/xe_migrate.c:491:12: note: declared here 491 | static u32 pte_update_size(struct xe_migrate *m, | ^~~~~~~~~~~~~~~ drivers/gpu/drm/xe/tests/xe_migrate.c:406:63: error: incompatible type for argument 3 of 'pte_update_size' 406 | batch_size += pte_update_size(m, dst_is_vram, dst_is_vram, dst, &dst_it, &src_L0, | ^~~~~~~~~~~ | | | bool {aka _Bool} drivers/gpu/drm/xe/xe_migrate.c:493:49: note: expected 'struct ttm_resource *' but argument is of type 'bool' {aka '_Bool'} 493 | struct ttm_resource *res, | ~~~~~~~~~~~~~~~~~~~~~^~~ drivers/gpu/drm/xe/tests/xe_migrate.c:406:76: error: passing argument 4 of 'pte_update_size' from incompatible pointer type [-Wincompatible-pointer-types] 406 | batch_size += pte_update_size(m, dst_is_vram, dst_is_vram, dst, &dst_it, &src_L0, | ^~~ | | | struct ttm_resource * drivers/gpu/drm/xe/xe_migrate.c:494:50: note: expected 'struct xe_res_cursor *' but argument is of type 'struct ttm_resource *' 494 | struct xe_res_cursor *cur, | ~~~~~~~~~~~~~~~~~~~~~~^~~ drivers/gpu/drm/xe/tests/xe_migrate.c:406:81: error: passing argument 5 of 'pte_update_size' from incompatible pointer type [-Wincompatible-pointer-types] 406 | batch_size += pte_update_size(m, dst_is_vram, dst_is_vram, dst, &dst_it, &src_L0, | ^~~~~~~ | | | struct xe_res_cursor * drivers/gpu/drm/xe/xe_migrate.c:495:33: note: expected 'u64 
*' {aka 'long long unsigned int *'} but argument is of type 'struct xe_res_cursor *' 495 | u64 *L0, u64 *L0_ofs, u32 *L0_pt, | ~~~~~^~ drivers/gpu/drm/xe/tests/xe_migrate.c:407:47: error: passing argument 7 of 'pte_update_size' from incompatible pointer type [-Wincompatible-pointer-types] 407 | &dst_L0_ofs, &dst_L0_pt, 0, | ^~~~~~~~~~~ | | | u64 * {aka long long unsigned int *} drivers/gpu/drm/xe/xe_migrate.c:495:55: note: expected 'u32 *' {aka 'unsigned int *'} but argument is of type 'u64 *' {aka 'long long unsigned int *'} 495 | u64 *L0, u64 *L0_ofs, u32 *L0_pt, | ~~~~~^~~~~ drivers/gpu/drm/xe/tests/xe_migrate.c:407:60: error: passing argument 8 of 'pte_update_size' makes integer from pointer without a cast [-Wint-conversion] 407 | &dst_L0_ofs, &dst_L0_pt, 0, | ^~~~~~~~~~ | | | u32 * {aka unsigned int *} drivers/gpu/drm/xe/xe_migrate.c:496:32: note: expected 'u32' {aka 'unsigned int'} but argument is of type 'u32 *' {aka 'unsigned int *'} 496 | u32 cmd_size, u32 pt_ofs, u32 avail_pts) | ~~~~^~~~~~~~ drivers/gpu/drm/xe/tests/xe_migrate.c:406:31: error: too many arguments to function 'pte_update_size' 406 | batch_size += pte_update_size(m, dst_is_vram, dst_is_vram, dst, &dst_it, &src_L0, | ^~~~~~~~~~~~~~~ drivers/gpu/drm/xe/xe_migrate.c:491:12: note: declared here 491 | static u32 pte_update_size(struct xe_migrate *m, | ^~~~~~~~~~~~~~~ make[6]: *** [scripts/Makefile.build:244: drivers/gpu/drm/xe/xe_migrate.o] Error 1 make[6]: Target 'drivers/gpu/drm/xe/' not remade because of errors. make[5]: *** [scripts/Makefile.build:485: drivers/gpu/drm/xe] Error 2 make[5]: Target 'drivers/gpu/drm/' not remade because of errors. make[4]: *** [scripts/Makefile.build:485: drivers/gpu/drm] Error 2 make[4]: Target 'drivers/gpu/' not remade because of errors. make[3]: *** [scripts/Makefile.build:485: drivers/gpu] Error 2 make[3]: Target 'drivers/' not remade because of errors. 
make[2]: *** [scripts/Makefile.build:485: drivers] Error 2 make[2]: Target './' not remade because of errors. make[1]: *** [/home/sasha/build/linus-next/Makefile:1926: .] Error 2 make[1]: Target '__all' not remade because of errors. make: *** [Makefile:224: __sub-make] Error 2 make: Target '__all' not remade because of errors.
From: Akshata Jahagirdar akshata.jahagirdar@intel.com
commit 2b808d6b2919cb2fe92901e5087da7b4ed4b9e07 upstream.
Xe2+ has unified compression (exactly one compression mode/format), where compression is now controlled via PAT at the PTE level. This simplifies KMD operations, as the KMD can now decompress freely without concern for the buffer's original compression format, unlike DG2, which had multiple compression formats and thus required copying the raw CCS state during VRAM eviction. In addition, mixed VRAM and system memory buffers were not supported with compression enabled.
On Xe2 dGPU, compression is still only supported with VRAM; however, we can now support compression with VRAM and system memory buffers, with GPU access being seamless underneath, so long as the KMD uses compressed -> uncompressed when doing VRAM -> system memory, to decompress the data. This also allows CPU access to such buffers, assuming that userspace first decompresses the corresponding pages being accessed. If the pages are already in system memory, then the KMD would have already decompressed them. When restoring such buffers with sysmem -> VRAM, the KMD can't easily know which pages were originally compressed, so we always use uncompressed -> uncompressed here. This also means we can drop all the raw CCS handling on such platforms (including needing to allocate extra CCS storage).
In order to support this, we now need two different identity mappings, one for compressed and one for uncompressed VRAM. In this patch, we set up the additional identity map for the VRAM with the compressed pat_index. We then select the appropriate mapping during migration/clear: during eviction (vram->sysmem), we use the mapping from compressed -> uncompressed; during restore (sysmem->vram), we need the mapping from uncompressed -> uncompressed.
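How the two identity maps are addressed can be sketched in userspace as follows. This is only an illustration of the offset arithmetic in xe_migrate_vram_ofs() from this patch: `struct vram_info`, `div_round_up_u64`, `vram_identity_ofs` and the `is_xe2` parameter (standing in for the GRAPHICS_VER(xe) >= 20 check) are hypothetical names. The 1 GiB granularity is an assumption inferred from the driver comment that the first map sits at a 256GiB offset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_1G		(1ULL << 30)
#define IDENTITY_OFFSET	256ULL	/* first identity map starts at 256 GiB */

/* Hypothetical stand-in for the VRAM fields used by the driver. */
struct vram_info {
	uint64_t dpa_base;		/* device physical address base */
	uint64_t actual_physical_size;	/* VRAM size in bytes */
};

uint64_t div_round_up_u64(uint64_t n, uint64_t d)
{
	return (n + d - 1) / d;
}

/*
 * Translate a VRAM address into an offset inside the migrate VM.
 * The uncompressed map starts at IDENTITY_OFFSET GiB; the compressed map
 * (Xe2+ only) starts right after it, i.e. IDENTITY_OFFSET plus the VRAM
 * size rounded up to whole GiB.
 */
uint64_t vram_identity_ofs(const struct vram_info *vram, uint64_t addr,
			   bool is_comp_pte, bool is_xe2)
{
	uint64_t identity_offset = IDENTITY_OFFSET;

	assert(addr >= vram->dpa_base);

	if (is_xe2 && is_comp_pte)
		identity_offset += div_round_up_u64(vram->actual_physical_size,
						    SZ_1G);

	/* Remove the DPA base, then add the start of the selected map. */
	return (addr - vram->dpa_base) + identity_offset * SZ_1G;
}
```

With 12 GiB of VRAM, for example, the same page is reachable at 256 GiB plus its offset through the uncompressed map and at 268 GiB plus its offset through the compressed one, so the migration code selects the behaviour purely by choosing the offset.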
v2: Formatting nits, Updated code to match recent changes in xe_migrate_prepare_vm(). (Matt)
v3: Move identity map loop to a helper function. (Matt Brost)
v4: Split helper function in different patch, and add asserts and nits. (Matt Brost)
v5: Convert the 2 bool arguments of pte_update_size to flags argument (Matt Brost)
v6: Formatting nits (Matt Brost)
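The v5 change (two bool arguments folded into one flags argument) follows a common C pattern, sketched below in userspace. The `DEMO_` names, `demo_pte_flags` and `demo_pte_decode` are hypothetical; only the idea of one bit for "is VRAM" and one for "use the compressed PTE" comes from the patch. Rejecting unknown bits with an assert is a design choice of this sketch, not something the driver does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PTE_FLAG_IS_VRAM		(1u << 0)
#define DEMO_PTE_FLAG_IS_COMP_PTE	(1u << 1)
#define DEMO_PTE_FLAG_MASK \
	(DEMO_PTE_FLAG_IS_VRAM | DEMO_PTE_FLAG_IS_COMP_PTE)

struct demo_pte_opts {
	bool is_vram;
	bool is_comp_pte;
};

/* Caller side: build the flags word instead of passing two booleans. */
uint32_t demo_pte_flags(bool is_vram, bool use_comp_pte)
{
	uint32_t flags = 0;

	if (is_vram) {
		flags |= DEMO_PTE_FLAG_IS_VRAM;
		if (use_comp_pte)
			flags |= DEMO_PTE_FLAG_IS_COMP_PTE;
	}
	return flags;
}

/* Callee side: decode the flags back into named booleans. */
struct demo_pte_opts demo_pte_decode(uint32_t flags)
{
	struct demo_pte_opts opts;

	/* Sketch-only policy: unknown bits are treated as a caller bug. */
	assert((flags & ~DEMO_PTE_FLAG_MASK) == 0);

	opts.is_vram = flags & DEMO_PTE_FLAG_IS_VRAM;
	opts.is_comp_pte = flags & DEMO_PTE_FLAG_IS_COMP_PTE;
	return opts;
}
```

Compared with adjacent bool parameters, a named flag at the call site makes it harder to swap arguments silently, and new options can be added without changing the function signature again.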
Signed-off-by: Akshata Jahagirdar akshata.jahagirdar@intel.com Reviewed-by: Himal Prasad Ghimiray himal.prasad.ghimiray@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/b00db5c7267e54260cb6183ba24b15... Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/xe/tests/xe_migrate.c | 9 ++- drivers/gpu/drm/xe/xe_migrate.c | 81 +++++++++++++++++++-------- 2 files changed, 66 insertions(+), 24 deletions(-)
diff --git a/drivers/gpu/drm/xe/tests/xe_migrate.c b/drivers/gpu/drm/xe/tests/xe_migrate.c index 353b908845f7d..4af27847f3fd8 100644 --- a/drivers/gpu/drm/xe/tests/xe_migrate.c +++ b/drivers/gpu/drm/xe/tests/xe_migrate.c @@ -393,17 +393,22 @@ static struct dma_fence *blt_copy(struct xe_tile *tile, u32 flush_flags = 0; u32 update_idx; u32 avail_pts = max_mem_transfer_per_pass(xe) / LEVEL0_PAGE_TABLE_ENCODE_SIZE; + u32 pte_flags;
src_L0 = xe_migrate_res_sizes(m, &src_it); dst_L0 = xe_migrate_res_sizes(m, &dst_it);
src_L0 = min(src_L0, dst_L0);
- batch_size += pte_update_size(m, src_is_vram, src_is_vram, src, &src_it, &src_L0, + pte_flags = src_is_vram ? (PTE_UPDATE_FLAG_IS_VRAM | + PTE_UPDATE_FLAG_IS_COMP_PTE) : 0; + batch_size += pte_update_size(m, pte_flags, src, &src_it, &src_L0, &src_L0_ofs, &src_L0_pt, 0, 0, avail_pts);
- batch_size += pte_update_size(m, dst_is_vram, dst_is_vram, dst, &dst_it, &src_L0, + pte_flags = dst_is_vram ? (PTE_UPDATE_FLAG_IS_VRAM | + PTE_UPDATE_FLAG_IS_COMP_PTE) : 0; + batch_size += pte_update_size(m, pte_flags, dst, &dst_it, &src_L0, &dst_L0_ofs, &dst_L0_pt, 0, avail_pts, avail_pts);
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c index f1cdb6f1fa176..2d7f69ac09a7f 100644 --- a/drivers/gpu/drm/xe/xe_migrate.c +++ b/drivers/gpu/drm/xe/xe_migrate.c @@ -73,6 +73,7 @@ struct xe_migrate { #define NUM_PT_SLOTS 32 #define LEVEL0_PAGE_TABLE_ENCODE_SIZE SZ_2M #define MAX_NUM_PTE 512 +#define IDENTITY_OFFSET 256ULL
/* * Although MI_STORE_DATA_IMM's "length" field is 10-bits, 0x3FE is the largest @@ -121,14 +122,19 @@ static u64 xe_migrate_vm_addr(u64 slot, u32 level) return (slot + 1ULL) << xe_pt_shift(level + 1); }
-static u64 xe_migrate_vram_ofs(struct xe_device *xe, u64 addr) +static u64 xe_migrate_vram_ofs(struct xe_device *xe, u64 addr, bool is_comp_pte) { /* * Remove the DPA to get a correct offset into identity table for the * migrate offset */ + u64 identity_offset = IDENTITY_OFFSET; + + if (GRAPHICS_VER(xe) >= 20 && is_comp_pte) + identity_offset += DIV_ROUND_UP_ULL(xe->mem.vram.actual_physical_size, SZ_1G); + addr -= xe->mem.vram.dpa_base; - return addr + (256ULL << xe_pt_shift(2)); + return addr + (identity_offset << xe_pt_shift(2)); }
static void xe_migrate_program_identity(struct xe_device *xe, struct xe_vm *vm, struct xe_bo *bo, @@ -182,11 +188,13 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m, struct xe_device *xe = tile_to_xe(tile); u16 pat_index = xe->pat.idx[XE_CACHE_WB]; u8 id = tile->id; - u32 num_entries = NUM_PT_SLOTS, num_level = vm->pt_root[id]->level, - num_setup = num_level + 1; + u32 num_entries = NUM_PT_SLOTS, num_level = vm->pt_root[id]->level; +#define VRAM_IDENTITY_MAP_COUNT 2 + u32 num_setup = num_level + VRAM_IDENTITY_MAP_COUNT; +#undef VRAM_IDENTITY_MAP_COUNT u32 map_ofs, level, i; struct xe_bo *bo, *batch = tile->mem.kernel_bb_pool->bo; - u64 entry, pt30_ofs; + u64 entry, pt29_ofs;
/* Can't bump NUM_PT_SLOTS too high */ BUILD_BUG_ON(NUM_PT_SLOTS > SZ_2M/XE_PAGE_SIZE); @@ -206,9 +214,9 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m, if (IS_ERR(bo)) return PTR_ERR(bo);
- /* PT31 reserved for 2M identity map */ - pt30_ofs = bo->size - 2 * XE_PAGE_SIZE; - entry = vm->pt_ops->pde_encode_bo(bo, pt30_ofs, pat_index); + /* PT30 & PT31 reserved for 2M identity map */ + pt29_ofs = bo->size - 3 * XE_PAGE_SIZE; + entry = vm->pt_ops->pde_encode_bo(bo, pt29_ofs, pat_index); xe_pt_write(xe, &vm->pt_root[id]->bo->vmap, 0, entry);
map_ofs = (num_entries - num_setup) * XE_PAGE_SIZE; @@ -260,12 +268,12 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m, } else { u64 batch_addr = xe_bo_addr(batch, 0, XE_PAGE_SIZE);
- m->batch_base_ofs = xe_migrate_vram_ofs(xe, batch_addr); + m->batch_base_ofs = xe_migrate_vram_ofs(xe, batch_addr, false);
if (xe->info.has_usm) { batch = tile->primary_gt->usm.bb_pool->bo; batch_addr = xe_bo_addr(batch, 0, XE_PAGE_SIZE); - m->usm_batch_base_ofs = xe_migrate_vram_ofs(xe, batch_addr); + m->usm_batch_base_ofs = xe_migrate_vram_ofs(xe, batch_addr, false); } }
@@ -299,18 +307,36 @@ static int xe_migrate_prepare_vm(struct xe_tile *tile, struct xe_migrate *m,
/* Identity map the entire vram at 256GiB offset */ if (IS_DGFX(xe)) { - u64 pt31_ofs = bo->size - XE_PAGE_SIZE; + u64 pt30_ofs = bo->size - 2 * XE_PAGE_SIZE;
- xe_migrate_program_identity(xe, vm, bo, map_ofs, 256, pat_index, pt31_ofs); - xe_assert(xe, (xe->mem.vram.actual_physical_size <= SZ_256G)); + xe_migrate_program_identity(xe, vm, bo, map_ofs, IDENTITY_OFFSET, + pat_index, pt30_ofs); + xe_assert(xe, xe->mem.vram.actual_physical_size <= + (MAX_NUM_PTE - IDENTITY_OFFSET) * SZ_1G); + + /* + * Identity map the entire vram for compressed pat_index for xe2+ + * if flat ccs is enabled. + */ + if (GRAPHICS_VER(xe) >= 20 && xe_device_has_flat_ccs(xe)) { + u16 comp_pat_index = xe->pat.idx[XE_CACHE_NONE_COMPRESSION]; + u64 vram_offset = IDENTITY_OFFSET + + DIV_ROUND_UP_ULL(xe->mem.vram.actual_physical_size, SZ_1G); + u64 pt31_ofs = bo->size - XE_PAGE_SIZE; + + xe_assert(xe, xe->mem.vram.actual_physical_size <= (MAX_NUM_PTE - + IDENTITY_OFFSET - IDENTITY_OFFSET / 2) * SZ_1G); + xe_migrate_program_identity(xe, vm, bo, map_ofs, vram_offset, + comp_pat_index, pt31_ofs); + } }
/* * Example layout created above, with root level = 3: * [PT0...PT7]: kernel PT's for copy/clear; 64 or 4KiB PTE's * [PT8]: Kernel PT for VM_BIND, 4 KiB PTE's - * [PT9...PT27]: Userspace PT's for VM_BIND, 4 KiB PTE's - * [PT28 = PDE 0] [PT29 = PDE 1] [PT30 = PDE 2] [PT31 = 2M vram identity map] + * [PT9...PT26]: Userspace PT's for VM_BIND, 4 KiB PTE's + * [PT27 = PDE 0] [PT28 = PDE 1] [PT29 = PDE 2] [PT30 & PT31 = 2M vram identity map] * * This makes the lowest part of the VM point to the pagetables. * Hence the lowest 2M in the vm should point to itself, with a few writes @@ -488,20 +514,26 @@ static bool xe_migrate_allow_identity(u64 size, const struct xe_res_cursor *cur) return cur->size >= size; }
+#define PTE_UPDATE_FLAG_IS_VRAM BIT(0) +#define PTE_UPDATE_FLAG_IS_COMP_PTE BIT(1) + static u32 pte_update_size(struct xe_migrate *m, - bool is_vram, + u32 flags, struct ttm_resource *res, struct xe_res_cursor *cur, u64 *L0, u64 *L0_ofs, u32 *L0_pt, u32 cmd_size, u32 pt_ofs, u32 avail_pts) { u32 cmds = 0; + bool is_vram = PTE_UPDATE_FLAG_IS_VRAM & flags; + bool is_comp_pte = PTE_UPDATE_FLAG_IS_COMP_PTE & flags;
*L0_pt = pt_ofs; if (is_vram && xe_migrate_allow_identity(*L0, cur)) { /* Offset into identity map. */ *L0_ofs = xe_migrate_vram_ofs(tile_to_xe(m->tile), - cur->start + vram_region_gpu_offset(res)); + cur->start + vram_region_gpu_offset(res), + is_comp_pte); cmds += cmd_size; } else { /* Clip L0 to available size */ @@ -780,6 +812,7 @@ struct dma_fence *xe_migrate_copy(struct xe_migrate *m, u32 update_idx; u64 ccs_ofs, ccs_size; u32 ccs_pt; + u32 pte_flags;
bool usm = xe->info.has_usm; u32 avail_pts = max_mem_transfer_per_pass(xe) / LEVEL0_PAGE_TABLE_ENCODE_SIZE; @@ -792,17 +825,19 @@ struct dma_fence *xe_migrate_copy(struct xe_migrate *m,
src_L0 = min(src_L0, dst_L0);
- batch_size += pte_update_size(m, src_is_vram, src, &src_it, &src_L0, + pte_flags = src_is_vram ? PTE_UPDATE_FLAG_IS_VRAM : 0; + batch_size += pte_update_size(m, pte_flags, src, &src_it, &src_L0, &src_L0_ofs, &src_L0_pt, 0, 0, avail_pts);
- batch_size += pte_update_size(m, dst_is_vram, dst, &dst_it, &src_L0, + pte_flags = dst_is_vram ? PTE_UPDATE_FLAG_IS_VRAM : 0; + batch_size += pte_update_size(m, pte_flags, dst, &dst_it, &src_L0, &dst_L0_ofs, &dst_L0_pt, 0, avail_pts, avail_pts);
if (copy_system_ccs) { ccs_size = xe_device_ccs_bytes(xe, src_L0); - batch_size += pte_update_size(m, false, NULL, &ccs_it, &ccs_size, + batch_size += pte_update_size(m, 0, NULL, &ccs_it, &ccs_size, &ccs_ofs, &ccs_pt, 0, 2 * avail_pts, avail_pts); @@ -1035,6 +1070,7 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m, struct xe_sched_job *job; struct xe_bb *bb; u32 batch_size, update_idx; + u32 pte_flags;
bool usm = xe->info.has_usm; u32 avail_pts = max_mem_transfer_per_pass(xe) / LEVEL0_PAGE_TABLE_ENCODE_SIZE; @@ -1042,8 +1078,9 @@ struct dma_fence *xe_migrate_clear(struct xe_migrate *m, clear_L0 = xe_migrate_res_sizes(m, &src_it);
/* Calculate final sizes and batch size.. */ + pte_flags = clear_vram ? PTE_UPDATE_FLAG_IS_VRAM : 0; batch_size = 2 + - pte_update_size(m, clear_vram, src, &src_it, + pte_update_size(m, pte_flags, src, &src_it, &clear_L0, &clear_L0_ofs, &clear_L0_pt, clear_system_ccs ? 0 : emit_clear_cmd_len(gt), 0, avail_pts); @@ -1159,7 +1196,7 @@ static void write_pgtable(struct xe_tile *tile, struct xe_bb *bb, u64 ppgtt_ofs, if (!ppgtt_ofs) ppgtt_ofs = xe_migrate_vram_ofs(tile_to_xe(tile), xe_bo_addr(update->pt_bo, 0, - XE_PAGE_SIZE)); + XE_PAGE_SIZE), false);
do { u64 addr = ppgtt_ofs + ofs * 8;
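The offset arithmetic behind the two identity maps can be sketched in plain C. This is a userspace model, not the driver code: the `toy_` helpers are hypothetical, `IDENTITY_OFFSET` is taken as 256 (from the "256GiB offset" comment in the hunk above), and `MAX_NUM_PTE` is assumed to be 512, since its definition is not shown in this part of the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_1G		(1ULL << 30)
#define MAX_NUM_PTE	512ULL	/* assumption: not visible in the hunk above */
#define IDENTITY_OFFSET	256ULL	/* in GiB, per the "256GiB offset" comment */

/* Same limits as the two xe_assert() calls in the hunk above. */
static bool toy_vram_fits(uint64_t vram_size, bool with_compressed_map)
{
	uint64_t slots = MAX_NUM_PTE - IDENTITY_OFFSET;

	if (with_compressed_map)
		slots -= IDENTITY_OFFSET / 2;
	return vram_size <= slots * SZ_1G;
}

/* 1 GiB slot at which the (un)compressed identity map starts. */
static uint64_t toy_identity_slot(uint64_t vram_size, bool compressed)
{
	uint64_t slot = IDENTITY_OFFSET;

	/* The compressed map is placed right after the uncompressed one. */
	if (compressed)
		slot += (vram_size + SZ_1G - 1) / SZ_1G;
	return slot;
}

/* Migration-VM address for a VRAM address relative to the VRAM base. */
static uint64_t toy_vram_ofs(uint64_t vram_size, uint64_t addr, bool compressed)
{
	assert(toy_vram_fits(vram_size, compressed));
	return addr + toy_identity_slot(vram_size, compressed) * SZ_1G;
}
```

With 12 GiB of VRAM, for example, the uncompressed map starts at slot 256 and the compressed one at slot 268; under the assumed constants the second assert caps VRAM at 128 GiB when both maps are present.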
From: Akshata Jahagirdar akshata.jahagirdar@intel.com
commit 523f191cc0c728a02a7e5fd0ec26526c41f399ef upstream.
During eviction (vram->sysmem), the source is read through the compressed identity map and written uncompressed (compressed -> uncompressed). During restore (sysmem->vram), both sides must use uncompressed mappings (uncompressed -> uncompressed). Add the logic to select the compressed identity map for eviction and the uncompressed map for restore.
v2: Move check of xe_migrate_ccs_emit() before calling xe_migrate_ccs_copy(). (Nirmoy)
Signed-off-by: Akshata Jahagirdar akshata.jahagirdar@intel.com Reviewed-by: Matthew Auld matthew.auld@intel.com Reviewed-by: Himal Prasad Ghimiray himal.prasad.ghimiray@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/79b3a016e686a662ae68c32b5fc7f0... Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/xe/xe_migrate.c | 19 +++++++++++-------- 1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c index 2d7f69ac09a7f..853bc3fd43705 100644 --- a/drivers/gpu/drm/xe/xe_migrate.c +++ b/drivers/gpu/drm/xe/xe_migrate.c @@ -706,7 +706,7 @@ static u32 xe_migrate_ccs_copy(struct xe_migrate *m, struct xe_gt *gt = m->tile->primary_gt; u32 flush_flags = 0;
- if (xe_device_has_flat_ccs(gt_to_xe(gt)) && !copy_ccs && dst_is_indirect) { + if (!copy_ccs && dst_is_indirect) { /* * If the src is already in vram, then it should already * have been cleared by us, or has been populated by the @@ -782,6 +782,7 @@ struct dma_fence *xe_migrate_copy(struct xe_migrate *m, bool copy_ccs = xe_device_has_flat_ccs(xe) && xe_bo_needs_ccs_pages(src_bo) && xe_bo_needs_ccs_pages(dst_bo); bool copy_system_ccs = copy_ccs && (!src_is_vram || !dst_is_vram); + bool use_comp_pat = GRAPHICS_VER(xe) >= 20 && IS_DGFX(xe) && src_is_vram && !dst_is_vram;
/* Copying CCS between two different BOs is not supported yet. */ if (XE_WARN_ON(copy_ccs && src_bo != dst_bo)) @@ -808,7 +809,7 @@ struct dma_fence *xe_migrate_copy(struct xe_migrate *m, u32 batch_size = 2; /* arb_clear() + MI_BATCH_BUFFER_END */ struct xe_sched_job *job; struct xe_bb *bb; - u32 flush_flags; + u32 flush_flags = 0; u32 update_idx; u64 ccs_ofs, ccs_size; u32 ccs_pt; @@ -826,6 +827,7 @@ struct dma_fence *xe_migrate_copy(struct xe_migrate *m, src_L0 = min(src_L0, dst_L0);
pte_flags = src_is_vram ? PTE_UPDATE_FLAG_IS_VRAM : 0; + pte_flags |= use_comp_pat ? PTE_UPDATE_FLAG_IS_COMP_PTE : 0; batch_size += pte_update_size(m, pte_flags, src, &src_it, &src_L0, &src_L0_ofs, &src_L0_pt, 0, 0, avail_pts); @@ -846,7 +848,7 @@ struct dma_fence *xe_migrate_copy(struct xe_migrate *m,
/* Add copy commands size here */ batch_size += ((copy_only_ccs) ? 0 : EMIT_COPY_DW) + - ((xe_device_has_flat_ccs(xe) ? EMIT_COPY_CCS_DW : 0)); + ((xe_migrate_needs_ccs_emit(xe) ? EMIT_COPY_CCS_DW : 0));
bb = xe_bb_new(gt, batch_size, usm); if (IS_ERR(bb)) { @@ -875,11 +877,12 @@ struct dma_fence *xe_migrate_copy(struct xe_migrate *m, if (!copy_only_ccs) emit_copy(gt, bb, src_L0_ofs, dst_L0_ofs, src_L0, XE_PAGE_SIZE);
- flush_flags = xe_migrate_ccs_copy(m, bb, src_L0_ofs, - IS_DGFX(xe) ? src_is_vram : src_is_pltt, - dst_L0_ofs, - IS_DGFX(xe) ? dst_is_vram : dst_is_pltt, - src_L0, ccs_ofs, copy_ccs); + if (xe_migrate_needs_ccs_emit(xe)) + flush_flags = xe_migrate_ccs_copy(m, bb, src_L0_ofs, + IS_DGFX(xe) ? src_is_vram : src_is_pltt, + dst_L0_ofs, + IS_DGFX(xe) ? dst_is_vram : dst_is_pltt, + src_L0, ccs_ofs, copy_ccs);
job = xe_bb_create_migration_job(m->q, bb, xe_migrate_batch_base(m, usm),
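The flag selection made in xe_migrate_copy() above can be modelled in isolation. The sketch below is only an illustration: `struct toy_dev` and the `toy_` functions are hypothetical stand-ins for the GRAPHICS_VER()/IS_DGFX() checks, while the two flag values mirror the defines added earlier in the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_UPDATE_FLAG_IS_VRAM		(1u << 0)
#define PTE_UPDATE_FLAG_IS_COMP_PTE	(1u << 1)

/* Hypothetical stand-in for the device properties the driver checks. */
struct toy_dev {
	unsigned int graphics_ver;
	bool is_dgfx;
};

/* Flags for the source side of a copy. */
static uint32_t toy_src_pte_flags(const struct toy_dev *dev,
				  bool src_is_vram, bool dst_is_vram)
{
	/* Only eviction (vram -> sysmem) on xe2+ dgfx reads via the compressed map. */
	bool use_comp_pat = dev->graphics_ver >= 20 && dev->is_dgfx &&
			    src_is_vram && !dst_is_vram;
	uint32_t flags = src_is_vram ? PTE_UPDATE_FLAG_IS_VRAM : 0;

	if (use_comp_pat)
		flags |= PTE_UPDATE_FLAG_IS_COMP_PTE;

	/* The compressed flag never appears without the VRAM flag. */
	assert(!(flags & PTE_UPDATE_FLAG_IS_COMP_PTE) ||
	       (flags & PTE_UPDATE_FLAG_IS_VRAM));
	return flags;
}

/* The destination side never uses the compressed map in this patch. */
static uint32_t toy_dst_pte_flags(bool dst_is_vram)
{
	return dst_is_vram ? PTE_UPDATE_FLAG_IS_VRAM : 0;
}
```

Restore (sysmem -> vram) and vram -> vram copies therefore keep using the uncompressed identity map on both sides.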
From: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com
commit 2e5d47fe7839298fa096970e184aac9bf82c3bd3 upstream.
drmm actions are not the right mechanism for cleaning up BOs; devm should be used instead. Rather than converting each action by hand, allocate the objects with the managed_bo function, which registers the correct cleanup call internally and lets us simplify the code.
While at it, switch to drmm_kzalloc for the GSC proxy allocation to further simplify the cleanup.
Cc: John Harrison John.C.Harrison@Intel.com Cc: Alan Previn alan.previn.teres.alexis@intel.com Signed-off-by: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com Reviewed-by: Matthew Brost matthew.brost@intel.com Reviewed-by: Matthew Auld matthew.auld@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20240815230541.3828206-1-lucas... Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/xe/xe_gsc.c | 12 +++-------- drivers/gpu/drm/xe/xe_gsc_proxy.c | 36 ++++++------------------------- drivers/gpu/drm/xe/xe_huc.c | 19 +++++----------- 3 files changed, 14 insertions(+), 53 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gsc.c b/drivers/gpu/drm/xe/xe_gsc.c index 29f96f4093918..9614d9e6617eb 100644 --- a/drivers/gpu/drm/xe/xe_gsc.c +++ b/drivers/gpu/drm/xe/xe_gsc.c @@ -450,11 +450,6 @@ static void free_resources(void *arg) xe_exec_queue_put(gsc->q); gsc->q = NULL; } - - if (gsc->private) { - xe_bo_unpin_map_no_vm(gsc->private); - gsc->private = NULL; - } }
int xe_gsc_init_post_hwconfig(struct xe_gsc *gsc) @@ -474,10 +469,9 @@ int xe_gsc_init_post_hwconfig(struct xe_gsc *gsc) if (!hwe) return -ENODEV;
- bo = xe_bo_create_pin_map(xe, tile, NULL, SZ_4M, - ttm_bo_type_kernel, - XE_BO_FLAG_STOLEN | - XE_BO_FLAG_GGTT); + bo = xe_managed_bo_create_pin_map(xe, tile, SZ_4M, + XE_BO_FLAG_STOLEN | + XE_BO_FLAG_GGTT); if (IS_ERR(bo)) return PTR_ERR(bo);
diff --git a/drivers/gpu/drm/xe/xe_gsc_proxy.c b/drivers/gpu/drm/xe/xe_gsc_proxy.c index aa812a2bc3edb..8f880c44211d0 100644 --- a/drivers/gpu/drm/xe/xe_gsc_proxy.c +++ b/drivers/gpu/drm/xe/xe_gsc_proxy.c @@ -376,27 +376,6 @@ static const struct component_ops xe_gsc_proxy_component_ops = { .unbind = xe_gsc_proxy_component_unbind, };
-static void proxy_channel_free(struct drm_device *drm, void *arg) -{ - struct xe_gsc *gsc = arg; - - if (!gsc->proxy.bo) - return; - - if (gsc->proxy.to_csme) { - kfree(gsc->proxy.to_csme); - gsc->proxy.to_csme = NULL; - gsc->proxy.from_csme = NULL; - } - - if (gsc->proxy.bo) { - iosys_map_clear(&gsc->proxy.to_gsc); - iosys_map_clear(&gsc->proxy.from_gsc); - xe_bo_unpin_map_no_vm(gsc->proxy.bo); - gsc->proxy.bo = NULL; - } -} - static int proxy_channel_alloc(struct xe_gsc *gsc) { struct xe_gt *gt = gsc_to_gt(gsc); @@ -405,18 +384,15 @@ static int proxy_channel_alloc(struct xe_gsc *gsc) struct xe_bo *bo; void *csme;
- csme = kzalloc(GSC_PROXY_CHANNEL_SIZE, GFP_KERNEL); + csme = drmm_kzalloc(&xe->drm, GSC_PROXY_CHANNEL_SIZE, GFP_KERNEL); if (!csme) return -ENOMEM;
- bo = xe_bo_create_pin_map(xe, tile, NULL, GSC_PROXY_CHANNEL_SIZE, - ttm_bo_type_kernel, - XE_BO_FLAG_SYSTEM | - XE_BO_FLAG_GGTT); - if (IS_ERR(bo)) { - kfree(csme); + bo = xe_managed_bo_create_pin_map(xe, tile, GSC_PROXY_CHANNEL_SIZE, + XE_BO_FLAG_SYSTEM | + XE_BO_FLAG_GGTT); + if (IS_ERR(bo)) return PTR_ERR(bo); - }
gsc->proxy.bo = bo; gsc->proxy.to_gsc = IOSYS_MAP_INIT_OFFSET(&bo->vmap, 0); @@ -424,7 +400,7 @@ static int proxy_channel_alloc(struct xe_gsc *gsc) gsc->proxy.to_csme = csme; gsc->proxy.from_csme = csme + GSC_PROXY_BUFFER_SIZE;
- return drmm_add_action_or_reset(&xe->drm, proxy_channel_free, gsc); + return 0; }
/** diff --git a/drivers/gpu/drm/xe/xe_huc.c b/drivers/gpu/drm/xe/xe_huc.c index bec4366e55138..f5459f97af23f 100644 --- a/drivers/gpu/drm/xe/xe_huc.c +++ b/drivers/gpu/drm/xe/xe_huc.c @@ -43,14 +43,6 @@ huc_to_guc(struct xe_huc *huc) return &container_of(huc, struct xe_uc, huc)->guc; }
-static void free_gsc_pkt(struct drm_device *drm, void *arg) -{ - struct xe_huc *huc = arg; - - xe_bo_unpin_map_no_vm(huc->gsc_pkt); - huc->gsc_pkt = NULL; -} - #define PXP43_HUC_AUTH_INOUT_SIZE SZ_4K static int huc_alloc_gsc_pkt(struct xe_huc *huc) { @@ -59,17 +51,16 @@ static int huc_alloc_gsc_pkt(struct xe_huc *huc) struct xe_bo *bo;
/* we use a single object for both input and output */ - bo = xe_bo_create_pin_map(xe, gt_to_tile(gt), NULL, - PXP43_HUC_AUTH_INOUT_SIZE * 2, - ttm_bo_type_kernel, - XE_BO_FLAG_SYSTEM | - XE_BO_FLAG_GGTT); + bo = xe_managed_bo_create_pin_map(xe, gt_to_tile(gt), + PXP43_HUC_AUTH_INOUT_SIZE * 2, + XE_BO_FLAG_SYSTEM | + XE_BO_FLAG_GGTT); if (IS_ERR(bo)) return PTR_ERR(bo);
huc->gsc_pkt = bo;
- return drmm_add_action_or_reset(&xe->drm, free_gsc_pkt, huc); + return 0; }
int xe_huc_init(struct xe_huc *huc)
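The idea behind the managed allocation used above can be illustrated with a small userspace model. Everything here is hypothetical (`toy_device`, `toy_managed_bo_create()` and friends are not kernel APIs); the sketch only shows the pattern where the allocator registers its own release action, so callers need no explicit free path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_ACTIONS 8

struct cleanup_action {
	void (*fn)(void *arg);
	void *arg;
};

/* Hypothetical device that owns a stack of cleanup actions. */
struct toy_device {
	struct cleanup_action actions[MAX_ACTIONS];
	int nr_actions;
	int freed;	/* counts released buffers, for demonstration only */
};

struct toy_bo {
	struct toy_device *dev;
	size_t size;
};

static int toy_add_action(struct toy_device *dev, void (*fn)(void *), void *arg)
{
	if (dev->nr_actions == MAX_ACTIONS)
		return -1;
	dev->actions[dev->nr_actions].fn = fn;
	dev->actions[dev->nr_actions].arg = arg;
	dev->nr_actions++;
	return 0;
}

static void toy_bo_release(void *arg)
{
	struct toy_bo *bo = arg;

	bo->dev->freed++;
	free(bo);
}

/* Allocate a buffer and register its release with the device. */
static struct toy_bo *toy_managed_bo_create(struct toy_device *dev, size_t size)
{
	struct toy_bo *bo = malloc(sizeof(*bo));

	if (!bo)
		return NULL;
	bo->dev = dev;
	bo->size = size;
	if (toy_add_action(dev, toy_bo_release, bo)) {
		free(bo);	/* undo the allocation if registration fails */
		return NULL;
	}
	return bo;
}

/* Run the registered actions in reverse order of registration. */
static void toy_device_teardown(struct toy_device *dev)
{
	assert(dev->nr_actions <= MAX_ACTIONS);
	while (dev->nr_actions > 0) {
		struct cleanup_action *a = &dev->actions[--dev->nr_actions];

		a->fn(a->arg);
	}
}
```

This is why the patch can drop free_gsc_pkt() and proxy_channel_free() and return 0 directly from the allocation helpers.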
From: Rodrigo Vivi rodrigo.vivi@intel.com
commit 6dbd43dcedf3b58a18eb3518e5c19e38a97aa68a upstream.
The DPT code depends heavily on the i915 vma implementation and has not yet been ported to Xe.
This patch limits inspection of the DPT's VMA struct to the intel_dpt component, so that the Xe GGTT code can evolve.
Cc: Matthew Brost matthew.brost@intel.com Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Juha-Pekka Heikkila juhapekka.heikkila@gmail.com Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20240821193842.352557-4-rodrig... Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/display/intel_dpt.c | 4 ++++ drivers/gpu/drm/i915/display/intel_dpt.h | 3 +++ drivers/gpu/drm/i915/display/skl_universal_plane.c | 3 ++- drivers/gpu/drm/xe/display/xe_fb_pin.c | 9 +++++++-- 4 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c index 73a1918e2537a..3a6d990448289 100644 --- a/drivers/gpu/drm/i915/display/intel_dpt.c +++ b/drivers/gpu/drm/i915/display/intel_dpt.c @@ -317,3 +317,7 @@ void intel_dpt_destroy(struct i915_address_space *vm) i915_vm_put(&dpt->vm); }
+u64 intel_dpt_offset(struct i915_vma *dpt_vma) +{ + return dpt_vma->node.start; +} diff --git a/drivers/gpu/drm/i915/display/intel_dpt.h b/drivers/gpu/drm/i915/display/intel_dpt.h index ff18a525bfbe6..1f88b0ee17e7e 100644 --- a/drivers/gpu/drm/i915/display/intel_dpt.h +++ b/drivers/gpu/drm/i915/display/intel_dpt.h @@ -6,6 +6,8 @@ #ifndef __INTEL_DPT_H__ #define __INTEL_DPT_H__
+#include <linux/types.h> + struct drm_i915_private;
struct i915_address_space; @@ -20,5 +22,6 @@ void intel_dpt_suspend(struct drm_i915_private *i915); void intel_dpt_resume(struct drm_i915_private *i915); struct i915_address_space * intel_dpt_create(struct intel_framebuffer *fb); +u64 intel_dpt_offset(struct i915_vma *dpt_vma);
#endif /* __INTEL_DPT_H__ */ diff --git a/drivers/gpu/drm/i915/display/skl_universal_plane.c b/drivers/gpu/drm/i915/display/skl_universal_plane.c index a1ab64db0130c..834771fc06204 100644 --- a/drivers/gpu/drm/i915/display/skl_universal_plane.c +++ b/drivers/gpu/drm/i915/display/skl_universal_plane.c @@ -14,6 +14,7 @@ #include "intel_de.h" #include "intel_display_irq.h" #include "intel_display_types.h" +#include "intel_dpt.h" #include "intel_fb.h" #include "intel_fbc.h" #include "intel_frontbuffer.h" @@ -1157,7 +1158,7 @@ static u32 skl_surf_address(const struct intel_plane_state *plane_state, * within the DPT is always 0. */ drm_WARN_ON(&i915->drm, plane_state->dpt_vma && - plane_state->dpt_vma->node.start); + intel_dpt_offset(plane_state->dpt_vma)); drm_WARN_ON(&i915->drm, offset & 0x1fffff); return offset >> 9; } else { diff --git a/drivers/gpu/drm/xe/display/xe_fb_pin.c b/drivers/gpu/drm/xe/display/xe_fb_pin.c index d7db44e79eaf5..42d431ff14e75 100644 --- a/drivers/gpu/drm/xe/display/xe_fb_pin.c +++ b/drivers/gpu/drm/xe/display/xe_fb_pin.c @@ -377,8 +377,8 @@ void intel_plane_unpin_fb(struct intel_plane_state *old_plane_state) }
/* - * For Xe introduce dummy intel_dpt_create which just return NULL and - * intel_dpt_destroy which does nothing. + * For Xe introduce dummy intel_dpt_create which just return NULL, + * intel_dpt_destroy which does nothing, and fake intel_dpt_ofsset returning 0; */ struct i915_address_space *intel_dpt_create(struct intel_framebuffer *fb) { @@ -389,3 +389,8 @@ void intel_dpt_destroy(struct i915_address_space *vm) { return; } + +u64 intel_dpt_offset(struct i915_vma *dpt_vma) +{ + return 0; +}
From: Imre Deak imre.deak@intel.com
commit 122824165471ea492d8b07d15384345940aababb upstream.
This is a preparation for the follow-up patch where polling will be handled properly for all cases during runtime suspend/resume.
v2: rebased
Reviewed-by: Arun R Murthy arun.r.murthy@intel.com Signed-off-by: Imre Deak imre.deak@intel.com Signed-off-by: Vinod Govindapillai vinod.govindapillai@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20240823112148.327015-3-vinod.... Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/xe/display/xe_display.c | 18 ++++++------------ 1 file changed, 6 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/xe/display/xe_display.c b/drivers/gpu/drm/xe/display/xe_display.c
index c860fda410c82..34b7050fc7c33 100644
--- a/drivers/gpu/drm/xe/display/xe_display.c
+++ b/drivers/gpu/drm/xe/display/xe_display.c
@@ -316,14 +316,11 @@ void xe_display_pm_suspend(struct xe_device *xe, bool runtime)
 	 */
 	intel_power_domains_disable(xe);
 	intel_fbdev_set_suspend(&xe->drm, FBINFO_STATE_SUSPENDED, true);
-	if (has_display(xe)) {
+	if (!runtime && has_display(xe)) {
 		drm_kms_helper_poll_disable(&xe->drm);
-		if (!runtime)
-			intel_display_driver_disable_user_access(xe);
-	}
-
-	if (!runtime)
+		intel_display_driver_disable_user_access(xe);
 		intel_display_driver_suspend(xe);
+	}
 
 	xe_display_flush_cleanup_work(xe);
 
@@ -380,15 +377,12 @@ void xe_display_pm_resume(struct xe_device *xe, bool runtime)
 
 	/* MST sideband requires HPD interrupts enabled */
 	intel_dp_mst_resume(xe);
-	if (!runtime)
+	if (!runtime && has_display(xe)) {
 		intel_display_driver_resume(xe);
-
-	if (has_display(xe)) {
 		drm_kms_helper_poll_enable(&xe->drm);
-		if (!runtime)
-			intel_display_driver_enable_user_access(xe);
+		intel_display_driver_enable_user_access(xe);
+		intel_hpd_poll_disable(xe);
 	}
-	intel_hpd_poll_disable(xe);
 
 	intel_opregion_resume(xe);
From: Vinod Govindapillai vinod.govindapillai@intel.com
commit 66a0f6b9f5fc205272035b6ffa4830be51e3f787 upstream.
In Xe, the display runtime suspend/resume routines are called only if d3cold is allowed. This makes the driver unable to detect any HPDs once the device enters the runtime suspended state on platforms like LNL. Update the display runtime suspend/resume routines to include HPD polling regardless of d3cold status.
xe_display_pm_suspend/resume() currently perform steps during runtime suspend/resume that shouldn't happen there, such as suspending MST, and they are missing other steps, such as enabling DC9. This patchset keeps the current behavior with respect to these, leaving the corresponding updates for a follow-up.
v2: have a separate function for display runtime s/r (Rodrigo)
v3: better streamlining of system s/r and runtime s/r calls (Imre)
v4: rebased
Reviewed-by: Arun R Murthy arun.r.murthy@intel.com Signed-off-by: Vinod Govindapillai vinod.govindapillai@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20240823112148.327015-4-vinod.... [ Fix silent conflict due to s/enable_display/probe_display/ ] Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/xe/display/xe_display.c | 23 +++++++++++++++++++++++ drivers/gpu/drm/xe/display/xe_display.h | 4 ++++ drivers/gpu/drm/xe/xe_pm.c | 8 +++++--- 3 files changed, 32 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/display/xe_display.c b/drivers/gpu/drm/xe/display/xe_display.c index 34b7050fc7c33..574909e098c17 100644 --- a/drivers/gpu/drm/xe/display/xe_display.c +++ b/drivers/gpu/drm/xe/display/xe_display.c @@ -304,6 +304,18 @@ static void xe_display_flush_cleanup_work(struct xe_device *xe) } }
+/* TODO: System and runtime suspend/resume sequences will be sanitized as a follow-up. */ +void xe_display_pm_runtime_suspend(struct xe_device *xe) +{ + if (!xe->info.enable_display) + return; + + if (xe->d3cold.allowed) + xe_display_pm_suspend(xe, true); + + intel_hpd_poll_enable(xe); +} + void xe_display_pm_suspend(struct xe_device *xe, bool runtime) { bool s2idle = suspend_to_idle(); @@ -349,6 +361,17 @@ void xe_display_pm_suspend_late(struct xe_device *xe) intel_display_power_suspend_late(xe); }
+void xe_display_pm_runtime_resume(struct xe_device *xe) +{ + if (!xe->info.enable_display) + return; + + intel_hpd_poll_disable(xe); + + if (xe->d3cold.allowed) + xe_display_pm_resume(xe, true); +} + void xe_display_pm_resume_early(struct xe_device *xe) { if (!xe->info.enable_display) diff --git a/drivers/gpu/drm/xe/display/xe_display.h b/drivers/gpu/drm/xe/display/xe_display.h index 000fb5799df54..53d727fd792b4 100644 --- a/drivers/gpu/drm/xe/display/xe_display.h +++ b/drivers/gpu/drm/xe/display/xe_display.h @@ -38,6 +38,8 @@ void xe_display_pm_suspend(struct xe_device *xe, bool runtime); void xe_display_pm_suspend_late(struct xe_device *xe); void xe_display_pm_resume_early(struct xe_device *xe); void xe_display_pm_resume(struct xe_device *xe, bool runtime); +void xe_display_pm_runtime_suspend(struct xe_device *xe); +void xe_display_pm_runtime_resume(struct xe_device *xe);
#else
@@ -67,6 +69,8 @@ static inline void xe_display_pm_suspend(struct xe_device *xe, bool runtime) {} static inline void xe_display_pm_suspend_late(struct xe_device *xe) {} static inline void xe_display_pm_resume_early(struct xe_device *xe) {} static inline void xe_display_pm_resume(struct xe_device *xe, bool runtime) {} +static inline void xe_display_pm_runtime_suspend(struct xe_device *xe) {} +static inline void xe_display_pm_runtime_resume(struct xe_device *xe) {}
#endif /* CONFIG_DRM_XE_DISPLAY */ #endif /* _XE_DISPLAY_H_ */ diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c index 9a3f618d22dcb..0f6f0526efbc0 100644 --- a/drivers/gpu/drm/xe/xe_pm.c +++ b/drivers/gpu/drm/xe/xe_pm.c @@ -362,9 +362,9 @@ int xe_pm_runtime_suspend(struct xe_device *xe) xe_bo_runtime_pm_release_mmap_offset(bo); mutex_unlock(&xe->mem_access.vram_userfault.lock);
- if (xe->d3cold.allowed) { - xe_display_pm_suspend(xe, true); + xe_display_pm_runtime_suspend(xe);
+ if (xe->d3cold.allowed) { err = xe_bo_evict_all(xe); if (err) goto out; @@ -426,12 +426,14 @@ int xe_pm_runtime_resume(struct xe_device *xe) for_each_gt(gt, xe, id) xe_gt_resume(gt);
+ xe_display_pm_runtime_resume(xe); + if (xe->d3cold.allowed) { - xe_display_pm_resume(xe, true); err = xe_bo_restore_user(xe); if (err) goto out; } + out: lock_map_release(&xe_pm_runtime_lockdep_map); xe_pm_write_callback_task(xe, NULL);
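The resulting ordering of the runtime PM display hooks can be traced with a small userspace model. The `toy_` names and the string log are hypothetical and exist only to make the call order visible; the branching mirrors xe_display_pm_runtime_suspend() and xe_display_pm_runtime_resume() as added above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical device model; log records the order of the steps taken. */
struct toy_xe {
	bool enable_display;
	bool d3cold_allowed;
	char log[128];
};

static void toy_log_step(struct toy_xe *xe, const char *step)
{
	/* Room for the step, the ';' separator and the terminating NUL. */
	assert(strlen(xe->log) + strlen(step) + 2 <= sizeof(xe->log));
	strcat(xe->log, step);
	strcat(xe->log, ";");
}

static void toy_display_runtime_suspend(struct toy_xe *xe)
{
	if (!xe->enable_display)
		return;

	/* The full display suspend is still limited to the d3cold case. */
	if (xe->d3cold_allowed)
		toy_log_step(xe, "display_suspend");

	/* HPD polling is enabled regardless of d3cold. */
	toy_log_step(xe, "hpd_poll_enable");
}

static void toy_display_runtime_resume(struct toy_xe *xe)
{
	if (!xe->enable_display)
		return;

	toy_log_step(xe, "hpd_poll_disable");

	if (xe->d3cold_allowed)
		toy_log_step(xe, "display_resume");
}
```

Without d3cold, a suspend/resume pair only toggles HPD polling; with d3cold, the full display suspend/resume wraps around it.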
From: Imre Deak imre.deak@intel.com
commit fef0bcf72b9506019ecd5440061d7df7f50b02b0 upstream.
If the device is runtime suspended, the eDP panel power is also off. Accordingly, ignore a short HPD on eDP if the device is suspended, instead of checking the panel power state via the PPS registers for the same purpose. Checking the registers requires runtime resuming the device unnecessarily, in the frequent scenario where the panel generates a spurious short HPD after its power is disabled and the device is runtime suspended.
Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Imre Deak imre.deak@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241009194358.1321200-2-imre.... Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/display/intel_dp.c | 5 ++++- drivers/gpu/drm/i915/intel_runtime_pm.h | 8 +++++++- drivers/gpu/drm/xe/compat-i915-headers/intel_runtime_pm.h | 8 ++++++++ 3 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c index 4ec724e8b2207..31cec30079509 100644 --- a/drivers/gpu/drm/i915/display/intel_dp.c +++ b/drivers/gpu/drm/i915/display/intel_dp.c @@ -82,6 +82,7 @@ #include "intel_pch_display.h" #include "intel_pps.h" #include "intel_psr.h" +#include "intel_runtime_pm.h" #include "intel_quirks.h" #include "intel_tc.h" #include "intel_vdsc.h" @@ -6408,7 +6409,9 @@ intel_dp_hpd_pulse(struct intel_digital_port *dig_port, bool long_hpd) u8 dpcd[DP_RECEIVER_CAP_SIZE];
if (dig_port->base.type == INTEL_OUTPUT_EDP && - (long_hpd || !intel_pps_have_panel_power_or_vdd(intel_dp))) { + (long_hpd || + intel_runtime_pm_suspended(&i915->runtime_pm) || + !intel_pps_have_panel_power_or_vdd(intel_dp))) { /* * vdd off can generate a long/short pulse on eDP which * would require vdd on to handle it, and thus we diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.h b/drivers/gpu/drm/i915/intel_runtime_pm.h index de3579d399e18..335d536c441b4 100644 --- a/drivers/gpu/drm/i915/intel_runtime_pm.h +++ b/drivers/gpu/drm/i915/intel_runtime_pm.h @@ -97,10 +97,16 @@ intel_rpm_wakelock_count(int wakeref_count) return wakeref_count >> INTEL_RPM_WAKELOCK_SHIFT; }
+static inline bool +intel_runtime_pm_suspended(struct intel_runtime_pm *rpm) +{ + return pm_runtime_suspended(rpm->kdev); +} + static inline void assert_rpm_device_not_suspended(struct intel_runtime_pm *rpm) { - WARN_ONCE(pm_runtime_suspended(rpm->kdev), + WARN_ONCE(intel_runtime_pm_suspended(rpm), "Device suspended during HW access\n"); }
diff --git a/drivers/gpu/drm/xe/compat-i915-headers/intel_runtime_pm.h b/drivers/gpu/drm/xe/compat-i915-headers/intel_runtime_pm.h index 8c7b315aa8acd..ab4455e869dc2 100644 --- a/drivers/gpu/drm/xe/compat-i915-headers/intel_runtime_pm.h +++ b/drivers/gpu/drm/xe/compat-i915-headers/intel_runtime_pm.h @@ -20,6 +20,14 @@ static inline void enable_rpm_wakeref_asserts(void *rpm) { }
+static inline bool +intel_runtime_pm_suspended(struct xe_runtime_pm *pm) +{ + struct xe_device *xe = container_of(pm, struct xe_device, runtime_pm); + + return pm_runtime_suspended(xe->drm.dev); +} + static inline intel_wakeref_t intel_runtime_pm_get(struct xe_runtime_pm *pm) { struct xe_device *xe = container_of(pm, struct xe_device, runtime_pm);
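The point of the new condition in intel_dp_hpd_pulse() is its evaluation order: the runtime-suspended check comes before the PPS query, so a suspended device is never woken just to read the panel power state. A minimal sketch, assuming a hypothetical `struct toy_port` whose `pps_reads` counter stands in for the register access:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical port model; pps_reads counts simulated PPS register reads. */
struct toy_port {
	bool is_edp;
	bool runtime_suspended;
	bool panel_power_on;
	int pps_reads;
};

/* Stand-in for the PPS query, which would need the device to be awake. */
static bool toy_pps_have_panel_power(struct toy_port *port)
{
	port->pps_reads++;
	return port->panel_power_on;
}

/* Returns true if the HPD pulse should be ignored. */
static bool toy_ignore_hpd_pulse(struct toy_port *port, bool long_hpd)
{
	assert(port != NULL);

	/* Short-circuit: the PPS query runs only if the device is awake. */
	return port->is_edp &&
	       (long_hpd ||
		port->runtime_suspended ||
		!toy_pps_have_panel_power(port));
}
```

For a suspended eDP port the pulse is ignored with `pps_reads` still at zero, which is the behavior the commit message describes.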
From: Imre Deak imre.deak@intel.com
commit a31f62f693c87316eea1711ab586f8f5a7d7a0b3 upstream.
A registered eDP connector is considered to be always connected, so it's unnecessary to poll it for a connect/disconnect event. Polling it involves AUX accesses toggling the panel power, which in turn can generate a spurious short HPD pulse and possibly a new poll cycle via the short HPD handler runtime resuming the device. Avoid this by disabling the polling for eDP connectors.
This avoids IGT tests timing out while waiting for the device to runtime suspend. The timeout is caused by the above runtime resume->poll->suspend->resume cycle keeping the device in the resumed state.
Testcase: igt/kms_pm_rpm/universal-planes Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com Signed-off-by: Imre Deak imre.deak@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20241009194358.1321200-3-imre.... Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/i915/display/intel_dp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 31cec30079509..6ce3d9c4904eb 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -6842,7 +6842,8 @@ intel_dp_init_connector(struct intel_digital_port *dig_port,
 	if (!HAS_GMCH(dev_priv) && DISPLAY_VER(dev_priv) < 12)
 		connector->interlace_allowed = true;
 
-	intel_connector->polled = DRM_CONNECTOR_POLL_HPD;
+	if (type != DRM_MODE_CONNECTOR_eDP)
+		intel_connector->polled = DRM_CONNECTOR_POLL_HPD;
 	intel_connector->base.polled = intel_connector->polled;
 
 	intel_connector_attach_encoder(intel_connector, intel_encoder);
From: Maarten Lankhorst maarten.lankhorst@linux.intel.com
commit f90491d4b64e302e940133103d3d9908e70e454f upstream.
The previous change ensures that pm_suspend is only called when suspending or resuming. This ensures no further bugs like those in the previous commit.
Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com Reviewed-by: Vinod Govindapillai vinod.govindapillai@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20240905150052.174895-3-maarte... [ s/probe_display/enable_display/ to fix conflicts ] Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com --- drivers/gpu/drm/xe/display/xe_display.c | 53 +++++++++++++++---------- drivers/gpu/drm/xe/display/xe_display.h | 8 ++-- drivers/gpu/drm/xe/xe_pm.c | 6 +-- 3 files changed, 39 insertions(+), 28 deletions(-)
diff --git a/drivers/gpu/drm/xe/display/xe_display.c b/drivers/gpu/drm/xe/display/xe_display.c index 574909e098c17..04e99d1beb21a 100644 --- a/drivers/gpu/drm/xe/display/xe_display.c +++ b/drivers/gpu/drm/xe/display/xe_display.c @@ -305,18 +305,7 @@ static void xe_display_flush_cleanup_work(struct xe_device *xe) }
/* TODO: System and runtime suspend/resume sequences will be sanitized as a follow-up. */ -void xe_display_pm_runtime_suspend(struct xe_device *xe) -{ - if (!xe->info.enable_display) - return; - - if (xe->d3cold.allowed) - xe_display_pm_suspend(xe, true); - - intel_hpd_poll_enable(xe); -} - -void xe_display_pm_suspend(struct xe_device *xe, bool runtime) +static void __xe_display_pm_suspend(struct xe_device *xe, bool runtime) { bool s2idle = suspend_to_idle(); if (!xe->info.enable_display) @@ -350,26 +339,31 @@ void xe_display_pm_suspend(struct xe_device *xe, bool runtime) intel_dmc_suspend(xe); }
-void xe_display_pm_suspend_late(struct xe_device *xe) +void xe_display_pm_suspend(struct xe_device *xe) +{ + __xe_display_pm_suspend(xe, false); +} + +void xe_display_pm_runtime_suspend(struct xe_device *xe) { - bool s2idle = suspend_to_idle(); if (!xe->info.enable_display) return;
- intel_power_domains_suspend(xe, s2idle); + if (xe->d3cold.allowed) + __xe_display_pm_suspend(xe, true);
- intel_display_power_suspend_late(xe); + intel_hpd_poll_enable(xe); }
-void xe_display_pm_runtime_resume(struct xe_device *xe) +void xe_display_pm_suspend_late(struct xe_device *xe) { + bool s2idle = suspend_to_idle(); if (!xe->info.enable_display) return;
- intel_hpd_poll_disable(xe); + intel_power_domains_suspend(xe, s2idle);
- if (xe->d3cold.allowed) - xe_display_pm_resume(xe, true); + intel_display_power_suspend_late(xe); }
void xe_display_pm_resume_early(struct xe_device *xe) @@ -382,7 +376,7 @@ void xe_display_pm_resume_early(struct xe_device *xe) intel_power_domains_resume(xe); }
-void xe_display_pm_resume(struct xe_device *xe, bool runtime) +static void __xe_display_pm_resume(struct xe_device *xe, bool runtime) { if (!xe->info.enable_display) return; @@ -414,6 +408,23 @@ void xe_display_pm_resume(struct xe_device *xe, bool runtime) intel_power_domains_enable(xe); }
+void xe_display_pm_resume(struct xe_device *xe)
+{
+	__xe_display_pm_resume(xe, false);
+}
+
+void xe_display_pm_runtime_resume(struct xe_device *xe)
+{
+	if (!xe->info.enable_display)
+		return;
+
+	intel_hpd_poll_disable(xe);
+
+	if (xe->d3cold.allowed)
+		__xe_display_pm_resume(xe, true);
+}
+
+
 static void display_device_remove(struct drm_device *dev, void *arg)
 {
 	struct xe_device *xe = arg;
diff --git a/drivers/gpu/drm/xe/display/xe_display.h b/drivers/gpu/drm/xe/display/xe_display.h
index 53d727fd792b4..bed55fd26f304 100644
--- a/drivers/gpu/drm/xe/display/xe_display.h
+++ b/drivers/gpu/drm/xe/display/xe_display.h
@@ -34,10 +34,10 @@ void xe_display_irq_enable(struct xe_device *xe, u32 gu_misc_iir);
 void xe_display_irq_reset(struct xe_device *xe);
 void xe_display_irq_postinstall(struct xe_device *xe, struct xe_gt *gt);

-void xe_display_pm_suspend(struct xe_device *xe, bool runtime);
+void xe_display_pm_suspend(struct xe_device *xe);
 void xe_display_pm_suspend_late(struct xe_device *xe);
 void xe_display_pm_resume_early(struct xe_device *xe);
-void xe_display_pm_resume(struct xe_device *xe, bool runtime);
+void xe_display_pm_resume(struct xe_device *xe);
 void xe_display_pm_runtime_suspend(struct xe_device *xe);
 void xe_display_pm_runtime_resume(struct xe_device *xe);

@@ -65,10 +65,10 @@ static inline void xe_display_irq_enable(struct xe_device *xe, u32 gu_misc_iir)
 static inline void xe_display_irq_reset(struct xe_device *xe) {}
 static inline void xe_display_irq_postinstall(struct xe_device *xe, struct xe_gt *gt) {}

-static inline void xe_display_pm_suspend(struct xe_device *xe, bool runtime) {}
+static inline void xe_display_pm_suspend(struct xe_device *xe) {}
 static inline void xe_display_pm_suspend_late(struct xe_device *xe) {}
 static inline void xe_display_pm_resume_early(struct xe_device *xe) {}
-static inline void xe_display_pm_resume(struct xe_device *xe, bool runtime) {}
+static inline void xe_display_pm_resume(struct xe_device *xe) {}
 static inline void xe_display_pm_runtime_suspend(struct xe_device *xe) {}
 static inline void xe_display_pm_runtime_resume(struct xe_device *xe) {}

diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index 0f6f0526efbc0..1402156e4b7e6 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -91,7 +91,7 @@ int xe_pm_suspend(struct xe_device *xe)
 	for_each_gt(gt, xe, id)
 		xe_gt_suspend_prepare(gt);

-	xe_display_pm_suspend(xe, false);
+	xe_display_pm_suspend(xe);

 	/* FIXME: Super racey... */
 	err = xe_bo_evict_all(xe);
@@ -101,7 +101,7 @@ int xe_pm_suspend(struct xe_device *xe)
 	for_each_gt(gt, xe, id) {
 		err = xe_gt_suspend(gt);
 		if (err) {
-			xe_display_pm_resume(xe, false);
+			xe_display_pm_resume(xe);
 			goto err;
 		}
 	}
@@ -154,7 +154,7 @@ int xe_pm_resume(struct xe_device *xe)
 	for_each_gt(gt, xe, id)
 		xe_gt_resume(gt);

-	xe_display_pm_resume(xe, false);
+	xe_display_pm_resume(xe);

 	err = xe_bo_restore_user(xe);
 	if (err)
From: Imre Deak imre.deak@intel.com
commit a4de6beb83fc5adee788518350247c629568901e upstream.
For clarity separate the d3cold and non-d3cold runtime PM handling. The only change in behavior is disabling polling later during runtime resume. This shouldn't make a difference, since the poll disabling is handled from a work, which could run at any point wrt. the runtime resume handler. The work will also require a runtime PM reference, syncing it with the resume handler.
Cc: Rodrigo Vivi rodrigo.vivi@intel.com
Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com
Signed-off-by: Imre Deak imre.deak@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20241009194358.1321200-4-imre....
[ Fix conflict: intel_opregion_resume() takes xe as argument instead of display ]
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
---
 drivers/gpu/drm/xe/display/xe_display.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/xe/display/xe_display.c b/drivers/gpu/drm/xe/display/xe_display.c
index 04e99d1beb21a..b011a1e3ffa38 100644
--- a/drivers/gpu/drm/xe/display/xe_display.c
+++ b/drivers/gpu/drm/xe/display/xe_display.c
@@ -337,6 +337,9 @@ static void __xe_display_pm_suspend(struct xe_device *xe, bool runtime)
 	intel_opregion_suspend(xe, s2idle ? PCI_D1 : PCI_D3cold);

 	intel_dmc_suspend(xe);
+
+	if (runtime && has_display(xe))
+		intel_hpd_poll_enable(xe);
 }

 void xe_display_pm_suspend(struct xe_device *xe)
@@ -349,8 +352,10 @@ void xe_display_pm_runtime_suspend(struct xe_device *xe)
 	if (!xe->info.enable_display)
 		return;

-	if (xe->d3cold.allowed)
+	if (xe->d3cold.allowed) {
 		__xe_display_pm_suspend(xe, true);
+		return;
+	}

 	intel_hpd_poll_enable(xe);
 }
@@ -398,9 +403,11 @@ static void __xe_display_pm_resume(struct xe_device *xe, bool runtime)
 		intel_display_driver_resume(xe);
 		drm_kms_helper_poll_enable(&xe->drm);
 		intel_display_driver_enable_user_access(xe);
-		intel_hpd_poll_disable(xe);
 	}

+	if (has_display(xe))
+		intel_hpd_poll_disable(xe);
+
 	intel_opregion_resume(xe);

 	intel_fbdev_set_suspend(&xe->drm, FBINFO_STATE_RUNNING, false);
@@ -418,10 +425,12 @@ void xe_display_pm_runtime_resume(struct xe_device *xe)
 	if (!xe->info.enable_display)
 		return;

-	intel_hpd_poll_disable(xe);
-
-	if (xe->d3cold.allowed)
+	if (xe->d3cold.allowed) {
 		__xe_display_pm_resume(xe, true);
+		return;
+	}
+
+	intel_hpd_poll_disable(xe);
 }
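For readers who want to see the resulting control flow without the diff noise, here is a minimal user-space sketch of the split between the d3cold and non-d3cold runtime PM paths. It is only a model: struct model_dev and the model_* functions are hypothetical stand-ins, not driver code, and the "full" path is reduced to a counter plus the polling state it leaves behind.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the few xe_device fields the patch looks at. */
struct model_dev {
	bool enable_display;
	bool d3cold_allowed;
	bool hpd_polling;	/* true while HPD polling is active */
	int full_suspends;	/* times the full display suspend path ran */
	int full_resumes;	/* times the full display resume path ran */
};

/* Models xe_display_pm_runtime_suspend() after the patch. */
static void model_runtime_suspend(struct model_dev *d)
{
	assert(d);
	if (!d->enable_display)
		return;

	if (d->d3cold_allowed) {
		/* Full path: it enables polling itself, then we are done. */
		d->full_suspends++;
		d->hpd_polling = true;
		return;
	}

	/* Non-d3cold path: only switch HPD detection to polling. */
	d->hpd_polling = true;
}

/* Models xe_display_pm_runtime_resume() after the patch. */
static void model_runtime_resume(struct model_dev *d)
{
	assert(d);
	if (!d->enable_display)
		return;

	if (d->d3cold_allowed) {
		/* Full path: polling is disabled as part of the full resume. */
		d->full_resumes++;
		d->hpd_polling = false;
		return;
	}

	/* Non-d3cold path: only stop polling. */
	d->hpd_polling = false;
}
```

In this model the two paths never both run for one transition, which is the point of the early returns added by the patch.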
From: Imre Deak imre.deak@intel.com
commit bbc4a30de095f0349d3c278500345a1b620d495e upstream.
Atm the display HPD interrupts that got disabled during runtime suspend, are re-enabled only if d3cold is enabled. Fix things by also re-enabling the interrupts if d3cold is disabled.
Cc: Rodrigo Vivi rodrigo.vivi@intel.com
Reviewed-by: Jonathan Cavitt jonathan.cavitt@intel.com
Signed-off-by: Imre Deak imre.deak@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20241009194358.1321200-5-imre....
(cherry picked from commit bbc4a30de095f0349d3c278500345a1b620d495e)
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
---
 drivers/gpu/drm/xe/display/xe_display.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/gpu/drm/xe/display/xe_display.c b/drivers/gpu/drm/xe/display/xe_display.c
index b011a1e3ffa38..696e3cd716991 100644
--- a/drivers/gpu/drm/xe/display/xe_display.c
+++ b/drivers/gpu/drm/xe/display/xe_display.c
@@ -430,6 +430,7 @@ void xe_display_pm_runtime_resume(struct xe_device *xe)
 		return;
 	}

+	intel_hpd_init(xe);
 	intel_hpd_poll_disable(xe);
 }
From: Thomas Hellström thomas.hellstrom@linux.intel.com
commit 379cad69bdfe522e840ed5f5c01ac8769006d53e upstream.
For non-d3cold-capable devices we'd like to be able to wake up the device from reclaim. In particular, for Lunar Lake we'd like to be able to blit CCS metadata to system at shrink time; at least from kswapd where it's reasonably OK to wait for rpm resume and a preceding rpm suspend.
Therefore use a separate lockdep map for such devices and prime it reclaim-tainted.
v2:
- Rename lockmap acquire- and release functions. (Rodrigo Vivi).
- Reinstate the old xe_pm_runtime_lockdep_prime() function and rename it to xe_rpm_might_enter_cb(). (Matthew Auld).
- Introduce a separate xe_pm_runtime_lockdep_prime function called from module init for known required locking orders.
v3:
- Actually hook up the prime function at module init.
v4:
- Rebase.
v5:
- Don't use reclaim-safe RPM with sriov.

Cc: "Vivi, Rodrigo" rodrigo.vivi@intel.com
Cc: "Auld, Matthew" matthew.auld@intel.com
Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com
Reviewed-by: Matthew Auld matthew.auld@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20240826143450.92511-1-thomas....
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
---
 drivers/gpu/drm/xe/xe_module.c |  9 ++++
 drivers/gpu/drm/xe/xe_pm.c     | 84 ++++++++++++++++++++++++++++------
 drivers/gpu/drm/xe/xe_pm.h     |  1 +
 3 files changed, 80 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
index 464ec24e2dd20..5791670c22a3b 100644
--- a/drivers/gpu/drm/xe/xe_module.c
+++ b/drivers/gpu/drm/xe/xe_module.c
@@ -13,6 +13,7 @@
 #include "xe_drv.h"
 #include "xe_hw_fence.h"
 #include "xe_pci.h"
+#include "xe_pm.h"
 #include "xe_observation.h"
 #include "xe_sched_job.h"

@@ -76,6 +77,10 @@ struct init_funcs {
 	void (*exit)(void);
 };

+static void xe_dummy_exit(void)
+{
+}
+
 static const struct init_funcs init_funcs[] = {
 	{
 		.init = xe_check_nomodeset,
@@ -96,6 +101,10 @@ static const struct init_funcs init_funcs[] = {
 		.init = xe_observation_sysctl_register,
 		.exit = xe_observation_sysctl_unregister,
 	},
+	{
+		.init = xe_pm_module_init,
+		.exit = xe_dummy_exit,
+	},
 };

 static int __init xe_call_init_func(unsigned int i)
diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index 1402156e4b7e6..356c66fb74a4f 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -69,11 +69,34 @@
  */

 #ifdef CONFIG_LOCKDEP
-static struct lockdep_map xe_pm_runtime_lockdep_map = {
-	.name = "xe_pm_runtime_lockdep_map"
+static struct lockdep_map xe_pm_runtime_d3cold_map = {
+	.name = "xe_rpm_d3cold_map"
+};
+
+static struct lockdep_map xe_pm_runtime_nod3cold_map = {
+	.name = "xe_rpm_nod3cold_map"
 };
 #endif

+static bool __maybe_unused xe_rpm_reclaim_safe(const struct xe_device *xe)
+{
+	return !xe->d3cold.capable && !xe->info.has_sriov;
+}
+
+static void xe_rpm_lockmap_acquire(const struct xe_device *xe)
+{
+	lock_map_acquire(xe_rpm_reclaim_safe(xe) ?
+			 &xe_pm_runtime_nod3cold_map :
+			 &xe_pm_runtime_d3cold_map);
+}
+
+static void xe_rpm_lockmap_release(const struct xe_device *xe)
+{
+	lock_map_release(xe_rpm_reclaim_safe(xe) ?
+			 &xe_pm_runtime_nod3cold_map :
+			 &xe_pm_runtime_d3cold_map);
+}
+
 /**
  * xe_pm_suspend - Helper for System suspend, i.e. S0->S3 / S0->S2idle
  * @xe: xe device instance
@@ -350,7 +373,7 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
 	 * annotation here and in xe_pm_runtime_get() lockdep will see
 	 * the potential lock inversion and give us a nice splat.
 	 */
-	lock_map_acquire(&xe_pm_runtime_lockdep_map);
+	xe_rpm_lockmap_acquire(xe);

 	/*
 	 * Applying lock for entire list op as xe_ttm_bo_destroy and xe_bo_move_notify
@@ -383,7 +406,7 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
 out:
 	if (err)
 		xe_display_pm_resume(xe, true);
-	lock_map_release(&xe_pm_runtime_lockdep_map);
+	xe_rpm_lockmap_release(xe);
 	xe_pm_write_callback_task(xe, NULL);
 	return err;
 }
@@ -403,7 +426,7 @@ int xe_pm_runtime_resume(struct xe_device *xe)
 	/* Disable access_ongoing asserts and prevent recursive pm calls */
 	xe_pm_write_callback_task(xe, current);

-	lock_map_acquire(&xe_pm_runtime_lockdep_map);
+	xe_rpm_lockmap_acquire(xe);

 	if (xe->d3cold.allowed) {
 		err = xe_pcode_ready(xe, true);
@@ -435,7 +458,7 @@ int xe_pm_runtime_resume(struct xe_device *xe)
 	}

 out:
-	lock_map_release(&xe_pm_runtime_lockdep_map);
+	xe_rpm_lockmap_release(xe);
 	xe_pm_write_callback_task(xe, NULL);
 	return err;
 }
@@ -449,15 +472,37 @@ int xe_pm_runtime_resume(struct xe_device *xe)
  * stuff that can happen inside the runtime_resume callback by acquiring
  * a dummy lock (it doesn't protect anything and gets compiled out on
  * non-debug builds). Lockdep then only needs to see the
- * xe_pm_runtime_lockdep_map -> runtime_resume callback once, and then can
- * hopefully validate all the (callers_locks) -> xe_pm_runtime_lockdep_map.
+ * xe_pm_runtime_xxx_map -> runtime_resume callback once, and then can
+ * hopefully validate all the (callers_locks) -> xe_pm_runtime_xxx_map.
  * For example if the (callers_locks) are ever grabbed in the
  * runtime_resume callback, lockdep should give us a nice splat.
  */
-static void pm_runtime_lockdep_prime(void)
+static void xe_rpm_might_enter_cb(const struct xe_device *xe)
 {
-	lock_map_acquire(&xe_pm_runtime_lockdep_map);
-	lock_map_release(&xe_pm_runtime_lockdep_map);
+	xe_rpm_lockmap_acquire(xe);
+	xe_rpm_lockmap_release(xe);
+}
+
+/*
+ * Prime the lockdep maps for known locking orders that need to
+ * be supported but that may not always occur on all systems.
+ */
+static void xe_pm_runtime_lockdep_prime(void)
+{
+	struct dma_resv lockdep_resv;
+
+	dma_resv_init(&lockdep_resv);
+	lock_map_acquire(&xe_pm_runtime_d3cold_map);
+	/* D3Cold takes the dma_resv locks to evict bos */
+	dma_resv_lock(&lockdep_resv, NULL);
+	dma_resv_unlock(&lockdep_resv);
+	lock_map_release(&xe_pm_runtime_d3cold_map);
+
+	/* Shrinkers might like to wake up the device under reclaim. */
+	fs_reclaim_acquire(GFP_KERNEL);
+	lock_map_acquire(&xe_pm_runtime_nod3cold_map);
+	lock_map_release(&xe_pm_runtime_nod3cold_map);
+	fs_reclaim_release(GFP_KERNEL);
 }

 /**
@@ -471,7 +516,7 @@ void xe_pm_runtime_get(struct xe_device *xe)
 	if (xe_pm_read_callback_task(xe) == current)
 		return;

-	pm_runtime_lockdep_prime();
+	xe_rpm_might_enter_cb(xe);
 	pm_runtime_resume(xe->drm.dev);
 }

@@ -501,7 +546,7 @@ int xe_pm_runtime_get_ioctl(struct xe_device *xe)
 	if (WARN_ON(xe_pm_read_callback_task(xe) == current))
 		return -ELOOP;

-	pm_runtime_lockdep_prime();
+	xe_rpm_might_enter_cb(xe);
 	return pm_runtime_get_sync(xe->drm.dev);
 }

@@ -569,7 +614,7 @@ bool xe_pm_runtime_resume_and_get(struct xe_device *xe)
 		return true;
 	}

-	pm_runtime_lockdep_prime();
+	xe_rpm_might_enter_cb(xe);
 	return pm_runtime_resume_and_get(xe->drm.dev) >= 0;
 }

@@ -661,3 +706,14 @@ void xe_pm_d3cold_allowed_toggle(struct xe_device *xe)
 	drm_dbg(&xe->drm,
 		"d3cold: allowed=%s\n", str_yes_no(xe->d3cold.allowed));
 }
+
+/**
+ * xe_pm_module_init() - Perform xe_pm specific module initialization.
+ *
+ * Return: 0 on success. Currently doesn't fail.
+ */
+int __init xe_pm_module_init(void)
+{
+	xe_pm_runtime_lockdep_prime();
+	return 0;
+}
diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
index 104a21ae6dfd0..9aef673b1c8ad 100644
--- a/drivers/gpu/drm/xe/xe_pm.h
+++ b/drivers/gpu/drm/xe/xe_pm.h
@@ -32,5 +32,6 @@ void xe_pm_assert_unbounded_bridge(struct xe_device *xe);
 int xe_pm_set_vram_threshold(struct xe_device *xe, u32 threshold);
 void xe_pm_d3cold_allowed_toggle(struct xe_device *xe);
 struct task_struct *xe_pm_read_callback_task(struct xe_device *xe);
+int xe_pm_module_init(void);

 #endif
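As a rough illustration of how the patch picks a lockdep map, the selection rule can be modeled in plain user-space C. This is a sketch only: struct model_info and the model_* helpers are hypothetical names, and the returned strings merely mirror the .name fields of the two lockdep maps in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for xe->d3cold.capable and xe->info.has_sriov. */
struct model_info {
	bool d3cold_capable;
	bool has_sriov;
};

/* Same predicate as xe_rpm_reclaim_safe() in the patch. */
static bool model_rpm_reclaim_safe(const struct model_info *info)
{
	assert(info);
	return !info->d3cold_capable && !info->has_sriov;
}

/* Which lockdep map name the acquire/release helpers would use. */
static const char *model_rpm_map_name(const struct model_info *info)
{
	return model_rpm_reclaim_safe(info) ? "xe_rpm_nod3cold_map"
					    : "xe_rpm_d3cold_map";
}
```

Only a device that is neither d3cold-capable nor SR-IOV-enabled ends up on the reclaim-tainted map; everything else keeps the original ordering, where the runtime PM callbacks may take dma_resv locks.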
From: Maarten Lankhorst maarten.lankhorst@linux.intel.com
commit 474f64cb988a410db8a0b779d6afdaa2a7fc5759 upstream.
This error path was missed when converting away from xe_display_pm_resume with second argument.
Fixes: 66a0f6b9f5fc ("drm/xe/display: handle HPD polling in display runtime suspend/resume")
Cc: Arun R Murthy arun.r.murthy@intel.com
Cc: Vinod Govindapillai vinod.govindapillai@intel.com
Signed-off-by: Maarten Lankhorst maarten.lankhorst@linux.intel.com
Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com
Reviewed-by: Vinod Govindapillai vinod.govindapillai@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20240905150052.174895-2-maarte...
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
---
 drivers/gpu/drm/xe/xe_pm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
index 356c66fb74a4f..d6501048f6418 100644
--- a/drivers/gpu/drm/xe/xe_pm.c
+++ b/drivers/gpu/drm/xe/xe_pm.c
@@ -405,7 +405,7 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
 	xe_display_pm_suspend_late(xe);
 out:
 	if (err)
-		xe_display_pm_resume(xe, true);
+		xe_display_pm_runtime_resume(xe);
 	xe_rpm_lockmap_release(xe);
 	xe_pm_write_callback_task(xe, NULL);
 	return err;
From: Gustavo Sousa gustavo.sousa@intel.com
commit 3bf90935aafc750c838c8831e96c3ac36cfd48d5 upstream.
With exception of "Tuning: L3 cache - media", we are currently applying recommended performance tuning settings only for the primary GT. Let's also implement them for the media GT when applicable.
According to our spec, media GT registers CCCHKNREG1 and L3SQCREG* exist only in Xe2_LPM and their offsets do not match their primary GT counterparts. Furthermore, the range where CCCHKNREG1 belongs is not listed as a multicast range on the media GT. As such, we need to have Xe2_LPM-specific definitions for those registers and apply the setting only for that specific IP.
Both Xe2_HPM and Xe2_LPM contain STATELESS_COMPRESSION_CTRL and the offset on the media GT matches the one on the primary one. So we can simply have a copy of "Tuning: Stateless compression control" for the media GT.
v2:
- Fix implementation with respect to multicast vs non-multicast registers. (Matt)
- Add missing XE2LPM_CCCHKNREG1 on second action of "Tuning: Compression Overfetch - media".
v3:
- STATELESS_COMPRESSION_CTRL on Xe2_HPM is also a multicast register, do not define a XE2HPM_STATELESS_COMPRESSION_CTRL register. (Tejas)

Bspec: 72161
Cc: Matt Roper matthew.d.roper@intel.com
Reviewed-by: Tejas Upadhyay tejas.upadhyay@intel.com
Signed-off-by: Gustavo Sousa gustavo.sousa@intel.com
Signed-off-by: Matt Roper matthew.d.roper@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20240920211459.255181-3-gustav...
(cherry picked from commit e1f813947ccf2326cfda4558b7d31430d7860c4b)
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
---
 drivers/gpu/drm/xe/regs/xe_gt_regs.h |  6 ++++++
 drivers/gpu/drm/xe/xe_tuning.c       | 20 ++++++++++++++++++++
 2 files changed, 26 insertions(+)

diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
index 4f0027d93efcb..d1b5be2585eda 100644
--- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
+++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
@@ -169,6 +169,8 @@
 #define XEHP_SLICE_COMMON_ECO_CHICKEN1	XE_REG_MCR(0x731c, XE_REG_OPTION_MASKED)
 #define   MSC_MSAA_REODER_BUF_BYPASS_DISABLE	REG_BIT(14)

+#define XE2LPM_CCCHKNREG1	XE_REG(0x82a8)
+
 #define VF_PREEMPTION	XE_REG(0x83a4, XE_REG_OPTION_MASKED)
 #define   PREEMPTION_VERTEX_COUNT	REG_GENMASK(15, 0)

@@ -391,6 +393,10 @@
 #define SCRATCH1LPFC	XE_REG(0xb474)
 #define   EN_L3_RW_CCS_CACHE_FLUSH	REG_BIT(0)

+#define XE2LPM_L3SQCREG2	XE_REG_MCR(0xb604)
+
+#define XE2LPM_L3SQCREG3	XE_REG_MCR(0xb608)
+
 #define XE2LPM_L3SQCREG5	XE_REG_MCR(0xb658)

 #define XE2_TDF_CTRL	XE_REG(0xb418)
diff --git a/drivers/gpu/drm/xe/xe_tuning.c b/drivers/gpu/drm/xe/xe_tuning.c
index faa1bf42e50ed..c798ae1b3f750 100644
--- a/drivers/gpu/drm/xe/xe_tuning.c
+++ b/drivers/gpu/drm/xe/xe_tuning.c
@@ -42,20 +42,40 @@ static const struct xe_rtp_entry_sr gt_tunings[] = {
 	  XE_RTP_ACTIONS(CLR(CCCHKNREG1, ENCOMPPERFFIX),
 			 SET(CCCHKNREG1, L3CMPCTRL))
 	},
+	{ XE_RTP_NAME("Tuning: Compression Overfetch - media"),
+	  XE_RTP_RULES(MEDIA_VERSION(2000)),
+	  XE_RTP_ACTIONS(CLR(XE2LPM_CCCHKNREG1, ENCOMPPERFFIX),
+			 SET(XE2LPM_CCCHKNREG1, L3CMPCTRL))
+	},
 	{ XE_RTP_NAME("Tuning: Enable compressible partial write overfetch in L3"),
 	  XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, XE_RTP_END_VERSION_UNDEFINED)),
 	  XE_RTP_ACTIONS(SET(L3SQCREG3, COMPPWOVERFETCHEN))
 	},
+	{ XE_RTP_NAME("Tuning: Enable compressible partial write overfetch in L3 - media"),
+	  XE_RTP_RULES(MEDIA_VERSION(2000)),
+	  XE_RTP_ACTIONS(SET(XE2LPM_L3SQCREG3, COMPPWOVERFETCHEN))
+	},
 	{ XE_RTP_NAME("Tuning: L2 Overfetch Compressible Only"),
 	  XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, XE_RTP_END_VERSION_UNDEFINED)),
 	  XE_RTP_ACTIONS(SET(L3SQCREG2, COMPMEMRD256BOVRFETCHEN))
 	},
+	{ XE_RTP_NAME("Tuning: L2 Overfetch Compressible Only - media"),
+	  XE_RTP_RULES(MEDIA_VERSION(2000)),
+	  XE_RTP_ACTIONS(SET(XE2LPM_L3SQCREG2,
+			     COMPMEMRD256BOVRFETCHEN))
+	},
 	{ XE_RTP_NAME("Tuning: Stateless compression control"),
 	  XE_RTP_RULES(GRAPHICS_VERSION_RANGE(2001, XE_RTP_END_VERSION_UNDEFINED)),
 	  XE_RTP_ACTIONS(FIELD_SET(STATELESS_COMPRESSION_CTRL, UNIFIED_COMPRESSION_FORMAT,
 				   REG_FIELD_PREP(UNIFIED_COMPRESSION_FORMAT, 0)))
 	},
+	{ XE_RTP_NAME("Tuning: Stateless compression control - media"),
+	  XE_RTP_RULES(MEDIA_VERSION_RANGE(1301, 2000)),
+	  XE_RTP_ACTIONS(FIELD_SET(STATELESS_COMPRESSION_CTRL, UNIFIED_COMPRESSION_FORMAT,
+				   REG_FIELD_PREP(UNIFIED_COMPRESSION_FORMAT, 0)))
+	},
+
 	{}
 };
From: Gustavo Sousa gustavo.sousa@intel.com
commit 6ef5a04221aaeb858d1a825b2ecb7e200cac80f8 upstream.
A recommended performance tuning for LNL related to L3 cache flushing was recently introduced in Bspec. Implement it.
Unlike the other existing tuning settings, we limit this one for LNL only, since there is no info about whether this would be applicable to other platforms yet. In the future we can come back and use IP version ranges if applicable.
v2:
- Fix reference to Bspec. (Sai Teja, Tejas)
- Use correct register name for "Tuning: L3 RW flush all Cache". (Sai Teja)
- Use SCRATCH3_LBCF (with the underscore) for better readability.
v3:
- Limit setting to LNL only. (Matt)

Bspec: 72161
Cc: Sai Teja Pottumuttu sai.teja.pottumuttu@intel.com
Cc: Tejas Upadhyay tejas.upadhyay@intel.com
Cc: Matt Roper matthew.d.roper@intel.com
Signed-off-by: Gustavo Sousa gustavo.sousa@intel.com
Reviewed-by: Matt Roper matthew.d.roper@intel.com
Reviewed-by: Tejas Upadhyay tejas.upadhyay@intel.com
Signed-off-by: Matt Roper matthew.d.roper@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20240920211459.255181-5-gustav...
(cherry picked from commit 876253165f3eaaacacb8c8bed16a9df4b6081479)
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
---
 drivers/gpu/drm/xe/regs/xe_gt_regs.h | 5 +++++
 drivers/gpu/drm/xe/xe_tuning.c       | 8 ++++++++
 2 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
index d1b5be2585eda..224ab4a425258 100644
--- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
+++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
@@ -380,6 +380,9 @@
 #define L3SQCREG3	XE_REG_MCR(0xb108)
 #define   COMPPWOVERFETCHEN	REG_BIT(28)

+#define SCRATCH3_LBCF	XE_REG_MCR(0xb154)
+#define   RWFLUSHALLEN	REG_BIT(17)
+
 #define XEHP_L3SQCREG5	XE_REG_MCR(0xb158)
 #define   L3_PWM_TIMER_INIT_VAL_MASK	REG_GENMASK(9, 0)

@@ -397,6 +400,8 @@

 #define XE2LPM_L3SQCREG3	XE_REG_MCR(0xb608)

+#define XE2LPM_SCRATCH3_LBCF	XE_REG_MCR(0xb654)
+
 #define XE2LPM_L3SQCREG5	XE_REG_MCR(0xb658)

 #define XE2_TDF_CTRL	XE_REG(0xb418)
diff --git a/drivers/gpu/drm/xe/xe_tuning.c b/drivers/gpu/drm/xe/xe_tuning.c
index c798ae1b3f750..0d5e04158917b 100644
--- a/drivers/gpu/drm/xe/xe_tuning.c
+++ b/drivers/gpu/drm/xe/xe_tuning.c
@@ -75,6 +75,14 @@ static const struct xe_rtp_entry_sr gt_tunings[] = {
 	  XE_RTP_ACTIONS(FIELD_SET(STATELESS_COMPRESSION_CTRL, UNIFIED_COMPRESSION_FORMAT,
 				   REG_FIELD_PREP(UNIFIED_COMPRESSION_FORMAT, 0)))
 	},
+	{ XE_RTP_NAME("Tuning: L3 RW flush all Cache"),
+	  XE_RTP_RULES(GRAPHICS_VERSION(2004)),
+	  XE_RTP_ACTIONS(SET(SCRATCH3_LBCF, RWFLUSHALLEN))
+	},
+	{ XE_RTP_NAME("Tuning: L3 RW flush all cache - media"),
+	  XE_RTP_RULES(MEDIA_VERSION(2000)),
+	  XE_RTP_ACTIONS(SET(XE2LPM_SCRATCH3_LBCF, RWFLUSHALLEN))
+	},

 	{}
 };
From: Matthew Auld matthew.auld@intel.com
commit fbd73b7d2ae29ef0f604f376bcc22b886a49329e upstream.
Rather extract the mem_type from the current resource. Checking the first potential placement doesn't really tell us where the bo is currently allocated, especially if there are multiple potential placements.
Signed-off-by: Matthew Auld matthew.auld@intel.com
Cc: Himal Prasad Ghimiray himal.prasad.ghimiray@intel.com
Cc: Tejas Upadhyay tejas.upadhyay@intel.com
Cc: "Thomas Hellström" thomas.hellstrom@linux.intel.com
Reviewed-by: Matthew Brost matthew.brost@intel.com
Reviewed-by: Tejas Upadhyay tejas.upadhyay@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20240911155527.178910-7-matthe...
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
---
 drivers/gpu/drm/xe/xe_drm_client.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_drm_client.c b/drivers/gpu/drm/xe/xe_drm_client.c
index c237ced421833..eaf40857c1ac6 100644
--- a/drivers/gpu/drm/xe/xe_drm_client.c
+++ b/drivers/gpu/drm/xe/xe_drm_client.c
@@ -168,15 +168,10 @@ static void bo_meminfo(struct xe_bo *bo,
 		       struct drm_memory_stats stats[TTM_NUM_MEM_TYPES])
 {
 	u64 sz = bo->size;
-	u32 mem_type;
+	u32 mem_type = bo->ttm.resource->mem_type;

 	xe_bo_assert_held(bo);

-	if (bo->placement.placement)
-		mem_type = bo->placement.placement->mem_type;
-	else
-		mem_type = XE_PL_TT;
-
 	if (drm_gem_object_is_shared_for_memory_stats(&bo->ttm.base))
 		stats[mem_type].shared += sz;
 	else
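To make the accounting difference concrete, here is a small user-space model of the fixed behaviour. Everything in it is hypothetical (the MODEL_PL_* enum, struct model_bo, model_bo_meminfo()); it only shows that stats are charged to the memory type the buffer currently occupies rather than to the first allowed placement.

```c
#include <assert.h>

/* Hypothetical memory types, loosely mirroring system, TT and VRAM. */
enum model_mem_type {
	MODEL_PL_SYSTEM,
	MODEL_PL_TT,
	MODEL_PL_VRAM0,
	MODEL_NUM_MEM_TYPES
};

/* Hypothetical stand-in for the parts of xe_bo that matter here. */
struct model_bo {
	unsigned long size;
	enum model_mem_type first_placement;	/* what the old code looked at */
	enum model_mem_type current_mem_type;	/* models bo->ttm.resource->mem_type */
};

/* Charge the bo to where it is resident right now. */
static void model_bo_meminfo(const struct model_bo *bo,
			     unsigned long stats[MODEL_NUM_MEM_TYPES])
{
	assert(bo && stats);
	stats[bo->current_mem_type] += bo->size;
}
```

A buffer that prefers VRAM but has been evicted to TT is counted under TT in this model, which is what the commit message argues the client stats should report.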
From: Matthew Brost matthew.brost@intel.com
commit f96dbf7c321d70834d46f3aedb75a671e839b51e upstream.
Closing a VM removes page table memory thus we shouldn't touch page tables when a VM is closed. Do not run the GPU page fault handler once the VM is closed to avoid touching page tables.
Signed-off-by: Matthew Brost matthew.brost@intel.com
Reviewed-by: Himal Prasad Ghimiray himal.prasad.ghimiray@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20240911011820.825127-1-matthe...
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
---
 drivers/gpu/drm/xe/xe_gt_pagefault.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpu/drm/xe/xe_gt_pagefault.c b/drivers/gpu/drm/xe/xe_gt_pagefault.c
index 730eec07795e2..00af059a8971a 100644
--- a/drivers/gpu/drm/xe/xe_gt_pagefault.c
+++ b/drivers/gpu/drm/xe/xe_gt_pagefault.c
@@ -212,6 +212,12 @@ static int handle_pagefault(struct xe_gt *gt, struct pagefault *pf)
 	 * TODO: Change to read lock? Using write lock for simplicity.
 	 */
 	down_write(&vm->lock);
+
+	if (xe_vm_is_closed(vm)) {
+		err = -ENOENT;
+		goto unlock_vm;
+	}
+
 	vma = lookup_vma(vm, pf->page_addr);
 	if (!vma) {
 		err = -EINVAL;
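The ordering of the checks is the interesting part of this patch, so here is a tiny user-space model of it. struct model_vm and model_handle_pagefault() are hypothetical; the locking is omitted and the VMA lookup is reduced to a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the VM state the fault handler consults. */
struct model_vm {
	bool closed;	/* models xe_vm_is_closed() */
	bool has_vma;	/* models lookup_vma() finding a mapping */
};

/* Bail out before any page-table related work if the VM is closed. */
static int model_handle_pagefault(const struct model_vm *vm)
{
	assert(vm);
	if (vm->closed)
		return -ENOENT;
	if (!vm->has_vma)
		return -EINVAL;
	return 0;
}
```

Because the closed check comes first, a closed VM reports -ENOENT even when a VMA would still be found, so the model never reaches the lookup for such a VM.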
From: Matthew Auld matthew.auld@intel.com
commit 67801fa67b94ebd0e4da7a77ac2d9f321b75fbe0 upstream.
An evil user can guess the next id of the queue before the ioctl completes and then call the queue destroy ioctl to trigger a UAF, since the create ioctl is still referencing the same queue. Move the xa_alloc all the way to the end to prevent this.
v2: - Rebase
Fixes: 2149ded63079 ("drm/xe: Fix use after free when client stats are captured")
Signed-off-by: Matthew Auld matthew.auld@intel.com
Cc: Matthew Brost matthew.brost@intel.com
Reviewed-by: Nirmoy Das nirmoy.das@intel.com
Reviewed-by: Matthew Brost matthew.brost@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20240925071426.144015-4-matthe...
(cherry picked from commit 16536582ddbebdbdf9e1d7af321bbba2bf955a87)
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
---
 drivers/gpu/drm/xe/xe_exec_queue.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/xe/xe_exec_queue.c b/drivers/gpu/drm/xe/xe_exec_queue.c
index 2179c65dc60ab..c031334a6d2a2 100644
--- a/drivers/gpu/drm/xe/xe_exec_queue.c
+++ b/drivers/gpu/drm/xe/xe_exec_queue.c
@@ -627,12 +627,14 @@ int xe_exec_queue_create_ioctl(struct drm_device *dev, void *data,
 		}
 	}

+	q->xef = xe_file_get(xef);
+
+	/* user id alloc must always be last in ioctl to prevent UAF */
 	err = xa_alloc(&xef->exec_queue.xa, &id, q, xa_limit_32b, GFP_KERNEL);
 	if (err)
 		goto kill_exec_queue;

 	args->exec_queue_id = id;
-	q->xef = xe_file_get(xef);

 	return 0;
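The rule the patch enforces is "finish initializing the object, then publish its id". A minimal single-threaded sketch of that ordering follows. It is not the driver code: the fixed-size table stands in for the xarray, and struct model_file, struct model_queue and the model_* functions are hypothetical, as is the simplified error unwinding.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_IDS 4

/* Hypothetical stand-ins for xe_file and xe_exec_queue. */
struct model_file {
	int refcount;
};

struct model_queue {
	struct model_file *xef;
};

/* Hypothetical id table standing in for xef->exec_queue.xa. */
struct model_id_table {
	struct model_queue *slots[MODEL_MAX_IDS];
};

/* Publish q under the first free id; returns the id or -1 if full. */
static int model_id_alloc(struct model_id_table *t, struct model_queue *q)
{
	for (int id = 0; id < MODEL_MAX_IDS; id++) {
		if (!t->slots[id]) {
			t->slots[id] = q;
			return id;
		}
	}
	return -1;
}

/* Create path: every field is set before the id becomes visible. */
static int model_queue_create(struct model_id_table *t, struct model_file *f,
			      struct model_queue *q)
{
	int id;

	assert(t && f && q);

	/* Take the file reference first, as the patch does with q->xef. */
	f->refcount++;
	q->xef = f;

	/* The id allocation is the last step; nothing touches q afterwards. */
	id = model_id_alloc(t, q);
	if (id < 0) {
		/* Simplified unwind for the model only. */
		q->xef = NULL;
		f->refcount--;
	}
	return id;
}
```

Once the id is in the table, another thread could look it up and destroy the queue, so the create path must not dereference the queue again after that point; in this sketch it does not.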
From: Chaitanya Kumar Borah chaitanya.kumar.borah@intel.com
commit 26c85e7f40f9aed4f5f04dcb0ea0bce5d44f6f54 upstream.
In case of UHBR rates, we do not need to explicitly enable FEC by writing to DP_TP_CTL register. For MST use-cases, intel_dp_mst_find_vcpi_slots_for_bpp() takes care of setting fec_enable to false. However, it gets overwritten in intel_dp_dsc_compute_config(). This change keeps fec_enable false across MST and SST use-cases for UHBR rates.
While at it, add a comment explaining why we don't enable FEC in eDP v1.5.
v2: Correct logic to cater to SST use-cases (Jani)
Signed-off-by: Chaitanya Kumar Borah chaitanya.kumar.borah@intel.com
Reviewed-by: Imre Deak imre.deak@intel.com
Signed-off-by: Suraj Kandpal suraj.kandpal@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20240822061448.4085693-1-chait...
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
---
 drivers/gpu/drm/i915/display/intel_dp.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/display/intel_dp.c b/drivers/gpu/drm/i915/display/intel_dp.c
index 6ce3d9c4904eb..7c1565e744ed6 100644
--- a/drivers/gpu/drm/i915/display/intel_dp.c
+++ b/drivers/gpu/drm/i915/display/intel_dp.c
@@ -2205,9 +2205,15 @@ int intel_dp_dsc_compute_config(struct intel_dp *intel_dp,
 		&pipe_config->hw.adjusted_mode;
 	int ret;

+	/*
+	 * Though eDP v1.5 supports FEC with DSC, unlike DP, it is optional.
+	 * Since, FEC is a bandwidth overhead, continue to not enable it for
+	 * eDP. Until, there is a good reason to do so.
+	 */
 	pipe_config->fec_enable = pipe_config->fec_enable ||
 		(!intel_dp_is_edp(intel_dp) &&
-		 intel_dp_supports_fec(intel_dp, connector, pipe_config));
+		 intel_dp_supports_fec(intel_dp, connector, pipe_config) &&
+		 !intel_dp_is_uhbr(pipe_config));

 	if (!intel_dp_supports_dsc(connector, pipe_config))
 		return -EINVAL;
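The new fec_enable expression is easy to check as a plain boolean function. The sketch below restates it in user-space C; model_fec_enable() and its parameter names are hypothetical, each parameter standing for the result of the corresponding helper in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * already_enabled: previous value of pipe_config->fec_enable
 * is_edp:          result of intel_dp_is_edp()
 * supports_fec:    result of intel_dp_supports_fec()
 * is_uhbr:         result of intel_dp_is_uhbr()
 */
static bool model_fec_enable(bool already_enabled, bool is_edp,
			     bool supports_fec, bool is_uhbr)
{
	return already_enabled || (!is_edp && supports_fec && !is_uhbr);
}
```

For a non-eDP sink that supports FEC, the result is true at non-UHBR rates and false at UHBR rates, unless fec_enable was already set before this point.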
From: Suraj Kandpal suraj.kandpal@intel.com
commit 47382485baa781b68622d94faa3473c9a235f23e upstream.
Remove intel_dp_mst_suspend/resume from the runtime suspend/resume sequences. Calling them there is incorrect, as they depend on AUX transfers, which themselves depend on the device being runtime resumed. This is also why we see a lockdep splat here.

<4> [76.011119] kworker/4:2/192 is trying to acquire lock:
<4> [76.011122] ffff8881120b3210 (&mgr->lock#2){+.+.}-{3:3}, at: drm_dp_mst_topology_mgr_suspend+0x33/0xd0 [drm_display_helper]
<4> [76.011142] but task is already holding lock:
<4> [76.011144] ffffffffa0bc3420 (xe_pm_runtime_lockdep_map){+.+.}-{0:0}, at: xe_pm_runtime_suspend+0x51/0x3f0 [xe]
<4> [76.011223] which lock already depends on the new lock.
<4> [76.011226] the existing dependency chain (in reverse order) is:
<4> [76.011229] -> #2 (xe_pm_runtime_lockdep_map){+.+.}-{0:0}:
<4> [76.011233] pm_runtime_lockdep_prime+0x2f/0x50 [xe]
<4> [76.011306] xe_pm_runtime_resume_and_get+0x29/0x90 [xe]
<4> [76.011377] intel_display_power_get+0x24/0x70 [xe]
<4> [76.011466] intel_digital_port_connected_locked+0x4c/0xf0 [xe]
<4> [76.011551] intel_dp_aux_xfer+0xb8/0x7c0 [xe]
<4> [76.011633] intel_dp_aux_transfer+0x166/0x2e0 [xe]
<4> [76.011715] drm_dp_dpcd_access+0x87/0x150 [drm_display_helper]
<4> [76.011726] drm_dp_dpcd_probe+0x3d/0xf0 [drm_display_helper]
<4> [76.011737] drm_dp_dpcd_read+0xdd/0x130 [drm_display_helper]
<4> [76.011747] intel_dp_get_colorimetry_status+0x3a/0x70 [xe]
<4> [76.011886] intel_dp_init_connector+0x4ff/0x1030 [xe]
<4> [76.011969] intel_ddi_init+0xc5b/0x1030 [xe]
<4> [76.012058] intel_bios_for_each_encoder+0x36/0x60 [xe]
<4> [76.012145] intel_setup_outputs+0x201/0x460 [xe]
<4> [76.012233] intel_display_driver_probe_nogem+0x155/0x1e0 [xe]
<4> [76.012320] xe_display_init_noaccel+0x27/0x70 [xe]
Signed-off-by: Suraj Kandpal suraj.kandpal@intel.com
Reviewed-by: Rodrigo Vivi rodrigo.vivi@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20240912012545.702032-2-suraj....
Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com
---
 drivers/gpu/drm/xe/display/xe_display.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/xe/display/xe_display.c b/drivers/gpu/drm/xe/display/xe_display.c
index 696e3cd716991..5aeb672f329de 100644
--- a/drivers/gpu/drm/xe/display/xe_display.c
+++ b/drivers/gpu/drm/xe/display/xe_display.c
@@ -325,7 +325,8 @@ static void __xe_display_pm_suspend(struct xe_device *xe, bool runtime)

 	xe_display_flush_cleanup_work(xe);

-	intel_dp_mst_suspend(xe);
+	if (!runtime)
+		intel_dp_mst_suspend(xe);

 	intel_hpd_cancel_work(xe);

@@ -398,7 +399,9 @@ static void __xe_display_pm_resume(struct xe_device *xe, bool runtime)
 	intel_display_driver_resume_access(xe);

 	/* MST sideband requires HPD interrupts enabled */
-	intel_dp_mst_resume(xe);
+	if (!runtime)
+		intel_dp_mst_resume(xe);
+
 	if (!runtime && has_display(xe)) {
 		intel_display_driver_resume(xe);
 		drm_kms_helper_poll_enable(&xe->drm);
On Fri, Nov 22, 2024 at 01:07:15PM -0800, Lucas De Marchi wrote:
> From: Suraj Kandpal suraj.kandpal@intel.com
> commit 47382485baa781b68622d94faa3473c9a235f23e upstream.
But this is not in 6.12, so why apply it only to 6.11?
We can't take patches for only one stable branch, so please fix this series up for 6.12.y as well, and then we can consider it for 6.11.y.
Also note that 6.11.y will only be alive for maybe one more week...
thanks,
greg k-h
On Mon, Dec 02, 2024 at 10:50:14AM +0100, Greg KH wrote:
> On Fri, Nov 22, 2024 at 01:07:15PM -0800, Lucas De Marchi wrote:
> > From: Suraj Kandpal suraj.kandpal@intel.com
> > commit 47382485baa781b68622d94faa3473c9a235f23e upstream.
> But this is not in 6.12, so why apply it only to 6.11?
oops, it should be in 6.12.
Rodrigo/Suraj why doesn't this patch have the proper Fixes trailer?
> We can't take patches for only one stable branch, so please fix this series up for 6.12.y as well, and then we can consider it for 6.11.y.
all these patches should already be in 6.12... I will take a look again to make sure we aren't missing patches there.
> Also note that 6.11.y will only be alive for maybe one more week...
ok, then maybe the distros still using 6.11 will need to pick these downstream or move on.
Lucas De Marchi
> thanks,
> greg k-h
On Mon, Dec 02, 2024 at 08:40:34AM -0600, Lucas De Marchi wrote:
> On Mon, Dec 02, 2024 at 10:50:14AM +0100, Greg KH wrote:
> > On Fri, Nov 22, 2024 at 01:07:15PM -0800, Lucas De Marchi wrote:
> > > From: Suraj Kandpal suraj.kandpal@intel.com
> > > commit 47382485baa781b68622d94faa3473c9a235f23e upstream.
> > But this is not in 6.12, so why apply it only to 6.11?
> oops, it should be in 6.12.
> Rodrigo/Suraj why doesn't this patch have the proper Fixes trailer?
> > We can't take patches for only one stable branch, so please fix this series up for 6.12.y as well, and then we can consider it for 6.11.y.
> all these patches should already be in 6.12... I will take a look again to make sure we aren't missing patches there.
> > Also note that 6.11.y will only be alive for maybe one more week...
> ok, then maybe the distros still using 6.11 will need to pick these downstream or move on.
I think most will have moved on by now, do you know any that are sticking with 6.11.y?
thanks,
greg k-h
On Mon, Dec 02, 2024 at 03:47:14PM +0100, Greg KH wrote:
> On Mon, Dec 02, 2024 at 08:40:34AM -0600, Lucas De Marchi wrote:
> > On Mon, Dec 02, 2024 at 10:50:14AM +0100, Greg KH wrote:
> > > On Fri, Nov 22, 2024 at 01:07:15PM -0800, Lucas De Marchi wrote:
> > > > From: Suraj Kandpal suraj.kandpal@intel.com
> > > > commit 47382485baa781b68622d94faa3473c9a235f23e upstream.
> > > But this is not in 6.12, so why apply it only to 6.11?
> > oops, it should be in 6.12.
> > Rodrigo/Suraj why doesn't this patch have the proper Fixes trailer?
> > > We can't take patches for only one stable branch, so please fix this series up for 6.12.y as well, and then we can consider it for 6.11.y.
> > all these patches should already be in 6.12... I will take a look again to make sure we aren't missing patches there.
> > > Also note that 6.11.y will only be alive for maybe one more week...
> > ok, then maybe the distros still using 6.11 will need to pick these downstream or move on.
> I think most will have moved on by now, do you know any that are sticking with 6.11.y?
From https://ubuntu.com/kernel/lifecycle it seems 6.11 EOL will be Jul/2025. They already have most of these patches, but not all. My intention was to migrate the fixes they've got to benefit all the 6.11 users... if other distros migrate to 6.12, then I believe this is not needed.
Lucas De Marchi
> thanks,
> greg k-h
On Mon, Dec 02, 2024 at 08:40:34AM -0600, Lucas De Marchi wrote:
> On Mon, Dec 02, 2024 at 10:50:14AM +0100, Greg KH wrote:
> > On Fri, Nov 22, 2024 at 01:07:15PM -0800, Lucas De Marchi wrote:
> > > From: Suraj Kandpal <suraj.kandpal@intel.com>
> > > commit 47382485baa781b68622d94faa3473c9a235f23e upstream.
> > But this is not in 6.12, so why apply it only to 6.11?
> oops, it should be in 6.12.
> Rodrigo/Suraj why doesn't this patch have the proper Fixes trailer?
hmm, missed fixes tag indeed, sorry...
But it is already in v6.13-rc1, so it should be enough to get to 6.12 and 6.11, no?!
> > We can't take patches for only one stable branch, so please fix this series up for 6.12.y as well, and then we can consider it for 6.11.y.
> all these patches should already be in 6.12... I will take a look again to make sure we aren't missing patches there.
> > Also note that 6.11.y will only be alive for maybe one more week...
> ok, then maybe the distros still using 6.11 will need to pick these downstream or move on.
> Lucas De Marchi
> > thanks,
> > greg k-h
On Mon, Dec 02, 2024 at 11:35:38AM -0500, Rodrigo Vivi wrote:
> On Mon, Dec 02, 2024 at 08:40:34AM -0600, Lucas De Marchi wrote:
> > On Mon, Dec 02, 2024 at 10:50:14AM +0100, Greg KH wrote:
> > > On Fri, Nov 22, 2024 at 01:07:15PM -0800, Lucas De Marchi wrote:
> > > > From: Suraj Kandpal <suraj.kandpal@intel.com>
> > > > commit 47382485baa781b68622d94faa3473c9a235f23e upstream.
> > > But this is not in 6.12, so why apply it only to 6.11?
> > oops, it should be in 6.12.
> > Rodrigo/Suraj why doesn't this patch have the proper Fixes trailer?
> hmm, missed fixes tag indeed, sorry...
> But it is already in v6.13-rc1, so it should be enough to get to 6.12 and 6.11, no?!
yes, but we will need to submit it as it will (likely) not be picked automatically.
Lucas De Marchi
From: Suraj Kandpal <suraj.kandpal@intel.com>
commit 5422d30957570b0f0283f8ad4d0dd45637c11db7 upstream.
Do not call intel_fbdev_set_suspend() from the runtime_suspend/resume functions. Doing so causes a big circular lockdep splat.
kworker/0:4/198 is trying to acquire lock:
<4> [77.185594] ffffffff83398500 (console_lock){+.+.}-{0:0}, at: intel_fbdev_set_suspend+0x169/0x1f0 [xe]
<4> [77.185947] but task is already holding lock:
<4> [77.185949] ffffffffa09e9460 (xe_pm_runtime_lockdep_map){+.+.}-{0:0}, at: xe_pm_runtime_suspend+0x51/0x3f0 [xe]
<4> [77.186262] which lock already depends on the new lock.
<4> [77.186264] the existing dependency chain (in reverse order) is:
<4> [77.186266] -> #2 (xe_pm_runtime_lockdep_map){+.+.}-{0:0}:
<4> [77.186276] pm_runtime_lockdep_prime+0x2f/0x50 [xe]
<4> [77.186572] xe_pm_runtime_resume_and_get+0x29/0x90 [xe]
<4> [77.186867] intelfb_create+0x150/0x390 [xe]
<4> [77.187197] __drm_fb_helper_initial_config_and_unlock+0x31c/0x5e0 [drm_kms_helper]
<4> [77.187243] drm_fb_helper_initial_config+0x3d/0x50 [drm_kms_helper]
<4> [77.187274] intel_fbdev_client_hotplug+0xb1/0x140 [xe]
<4> [77.187603] drm_client_register+0x87/0xd0 [drm]
<4> [77.187704] intel_fbdev_setup+0x51c/0x640 [xe]
<4> [77.188033] intel_display_driver_register+0xb7/0xf0 [xe]
<4> [77.188438] xe_display_register+0x21/0x40 [xe]
<4> [77.188809] xe_device_probe+0xa8d/0xbf0 [xe]
<4> [77.189035] xe_pci_probe+0x333/0x5b0 [xe]
<4> [77.189330] local_pci_probe+0x48/0xb0
<4> [77.189341] pci_device_probe+0xc8/0x280
<4> [77.189351] really_probe+0xf8/0x390
<4> [77.189362] __driver_probe_device+0x8a/0x170
<4> [77.189373] driver_probe_device+0x23/0xb0
Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240912012545.702032-3-suraj....
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/display/xe_display.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/xe/display/xe_display.c b/drivers/gpu/drm/xe/display/xe_display.c
index 5aeb672f329de..b778b9748aceb 100644
--- a/drivers/gpu/drm/xe/display/xe_display.c
+++ b/drivers/gpu/drm/xe/display/xe_display.c
@@ -316,7 +316,9 @@ static void __xe_display_pm_suspend(struct xe_device *xe, bool runtime)
 	 * properly.
 	 */
 	intel_power_domains_disable(xe);
-	intel_fbdev_set_suspend(&xe->drm, FBINFO_STATE_SUSPENDED, true);
+	if (!runtime)
+		intel_fbdev_set_suspend(&xe->drm, FBINFO_STATE_SUSPENDED, true);
+
 	if (!runtime && has_display(xe)) {
 		drm_kms_helper_poll_disable(&xe->drm);
 		intel_display_driver_disable_user_access(xe);
@@ -413,7 +415,8 @@ static void __xe_display_pm_resume(struct xe_device *xe, bool runtime)

 	intel_opregion_resume(xe);

-	intel_fbdev_set_suspend(&xe->drm, FBINFO_STATE_RUNNING, false);
+	if (!runtime)
+		intel_fbdev_set_suspend(&xe->drm, FBINFO_STATE_RUNNING, false);

 	intel_power_domains_enable(xe);
 }
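For readers skimming the diff, the effect of the change above can be sketched outside the kernel as follows. This is only an illustrative mock, not driver code: struct mock_display, mock_fbdev_set_suspend() and the mock_display_pm_*() helpers are hypothetical names standing in for the xe device state, intel_fbdev_set_suspend() and __xe_display_pm_suspend()/__xe_display_pm_resume(). The point it tries to show is that the path which takes console_lock is reached only for system suspend/resume and is skipped when runtime is true.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state; not the real xe structs. */
struct mock_display {
	bool fbdev_suspended;
	int fbdev_calls;	/* how often the console_lock-taking path ran */
};

/* Stand-in for intel_fbdev_set_suspend(), which takes console_lock. */
static void mock_fbdev_set_suspend(struct mock_display *d, bool suspend)
{
	d->fbdev_suspended = suspend;
	d->fbdev_calls++;
}

/* Same shape as the patched suspend path: skip fbdev for runtime PM. */
void mock_display_pm_suspend(struct mock_display *d, bool runtime)
{
	assert(d);
	if (!runtime)
		mock_fbdev_set_suspend(d, true);
}

/* Same shape as the patched resume path: skip fbdev for runtime PM. */
void mock_display_pm_resume(struct mock_display *d, bool runtime)
{
	assert(d);
	if (!runtime)
		mock_fbdev_set_suspend(d, false);
}
```

Calling mock_display_pm_suspend() with runtime set to true leaves fbdev_calls untouched, which corresponds to never acquiring console_lock while xe_pm_runtime_lockdep_map is held.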
From: Matthew Auld <matthew.auld@intel.com>
commit 6df106e93f79fb7dc90546a2d93bb3776b42863e upstream.
The BSpec says that EN_L3_RW_CCS_CACHE_FLUSH must be toggled on for manual global invalidation to take effect and actually flush device cache, however this also turns on flushing for things like pipecontrol, which occurs between submissions for compute/render. This sounds like massive overkill for our needs, where we already have the manual flushing on the display side with the global invalidation. Some observations on BMG:
1. Disabling l2 caching for host writes and stubbing out the driver global invalidation but keeping EN_L3_RW_CCS_CACHE_FLUSH enabled, has no impact on wb-transient-vs-display IGT, which makes sense since the pipecontrol is now flushing the device cache after the render copy. Without EN_L3_RW_CCS_CACHE_FLUSH the test then fails, which is also expected since device cache is now dirty and display engine can't see the writes.
2. Disabling EN_L3_RW_CCS_CACHE_FLUSH, but keeping the driver global invalidation also has no impact on wb-transient-vs-display. This suggests that the global invalidation still works as expected and is flushing the device cache without EN_L3_RW_CCS_CACHE_FLUSH turned on.
With that drop EN_L3_RW_CCS_CACHE_FLUSH. This helps some workloads since we no longer flush the device cache between submissions as part of pipecontrol.
Edit: We now also have clarification from HW side that BSpec was indeed wrong here.
v2:
  - Rebase and update commit message.
BSpec: 71718
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Vitasta Wattal <vitasta.wattal@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241007074541.33937-2-matthew...
(cherry picked from commit 67ec9f87bd6c57db1251bb2244d242f7ca5a0b6a)
[ Fix conflict due to changed xe_mmio_write32() signature ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/regs/xe_gt_regs.h | 3 ---
 drivers/gpu/drm/xe/xe_gt.c           | 1 -
 2 files changed, 4 deletions(-)
diff --git a/drivers/gpu/drm/xe/regs/xe_gt_regs.h b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
index 224ab4a425258..bd604b9f08e4f 100644
--- a/drivers/gpu/drm/xe/regs/xe_gt_regs.h
+++ b/drivers/gpu/drm/xe/regs/xe_gt_regs.h
@@ -393,9 +393,6 @@

 #define XE2_GLOBAL_INVAL	XE_REG(0xb404)

-#define SCRATCH1LPFC	XE_REG(0xb474)
-#define EN_L3_RW_CCS_CACHE_FLUSH	REG_BIT(0)
-
 #define XE2LPM_L3SQCREG2	XE_REG_MCR(0xb604)

 #define XE2LPM_L3SQCREG3	XE_REG_MCR(0xb608)
diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
index ba9f50c1faa67..a4a5c012a1b0b 100644
--- a/drivers/gpu/drm/xe/xe_gt.c
+++ b/drivers/gpu/drm/xe/xe_gt.c
@@ -108,7 +108,6 @@ static void xe_gt_enable_host_l2_vram(struct xe_gt *gt)
 		return;

 	if (!xe_gt_is_media_type(gt)) {
-		xe_mmio_write32(gt, SCRATCH1LPFC, EN_L3_RW_CCS_CACHE_FLUSH);
 		reg = xe_gt_mcr_unicast_read_any(gt, XE2_GAMREQSTRM_CTRL);
 		reg |= CG_DIS_CNTLBUS;
 		xe_gt_mcr_multicast_write(gt, XE2_GAMREQSTRM_CTRL, reg);
From: Aradhya Bhatia <aradhya.bhatia@intel.com>
commit 4ceead37ca9f5e555fe46e8528bd14dd1d2728e8 upstream.
Add workaround (wa) 15016589081 which applies to Xe2_v3_LPG_MD.
Xe2_v3_LPG_MD is a Lunar Lake platform with GFX version: 20.04. This wa is type: permanent, and hence is applicable on all steppings.
Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241009065542.283151-1-aradhy...
(cherry picked from commit 8fb1da9f9bfb02f710a7f826d50781b0b030cf53)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_wa.c | 4 ++++
 1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/xe/xe_wa.c b/drivers/gpu/drm/xe/xe_wa.c
index e2d7ccc6f144b..28c514b2aa3a1 100644
--- a/drivers/gpu/drm/xe/xe_wa.c
+++ b/drivers/gpu/drm/xe/xe_wa.c
@@ -710,6 +710,10 @@ static const struct xe_rtp_entry_sr lrc_was[] = {
 			     DIS_PARTIAL_AUTOSTRIP |
 			     DIS_AUTOSTRIP))
 	},
+	{ XE_RTP_NAME("15016589081"),
+	  XE_RTP_RULES(GRAPHICS_VERSION(2004), ENGINE_CLASS(RENDER)),
+	  XE_RTP_ACTIONS(SET(CHICKEN_RASTER_1, DIS_CLIP_NEGATIVE_BOUNDING_BOX))
+	},

 	/* Xe2_HPG */
 	{ XE_RTP_NAME("15010599737"),
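As a rough illustration of what the new table entry expresses, the following standalone sketch matches a workaround by graphics version and engine class. It is a deliberate simplification and an assumption on my side: the real xe RTP tables support richer rules and actions than an exact match, and every mock_* name below is hypothetical. Only the workaround number, the 2004 graphics version (20.04) and the render engine class come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced set of engine classes for the sketch. */
enum mock_engine_class { MOCK_RENDER, MOCK_COMPUTE, MOCK_COPY };

/* Hypothetical flattened form of one rule: exact version and class match. */
struct mock_wa_entry {
	const char *name;
	unsigned int graphics_version;	/* 2004 encodes version 20.04 */
	enum mock_engine_class engine_class;
};

static const struct mock_wa_entry mock_lrc_was[] = {
	{ "15016589081", 2004, MOCK_RENDER },
};

/* Return true if the named workaround applies to this version and engine. */
bool mock_wa_applies(const char *name, unsigned int version,
		     enum mock_engine_class engine_class)
{
	size_t i;

	assert(name);
	for (i = 0; i < sizeof(mock_lrc_was) / sizeof(mock_lrc_was[0]); i++) {
		const struct mock_wa_entry *e = &mock_lrc_was[i];

		if (!strcmp(e->name, name) &&
		    e->graphics_version == version &&
		    e->engine_class == engine_class)
			return true;
	}
	return false;
}
```

In this sketch the entry matches only graphics version 2004 on the render engine, which mirrors the two rules listed in the added table entry; since the workaround is described as permanent, no stepping check is modelled.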
From: He Lugang <helugang@uniontech.com>
commit cb58977016d1b25781743e5fbe6a545493785e37 upstream.
Use devm_add_action_or_reset() to release resources in case of failure, because the cleanup function will be automatically called.
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: He Lugang <helugang@uniontech.com>
Link: https://patchwork.freedesktop.org/patch/msgid/9631BC17D1E028A2+2024091110221...
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit fdc81c43f0c14ace6383024a02585e3fcbd1ceba)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
---
 drivers/gpu/drm/xe/xe_gt_freq.c  | 4 ++--
 drivers/gpu/drm/xe/xe_gt_sysfs.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_gt_freq.c b/drivers/gpu/drm/xe/xe_gt_freq.c
index 68a5778b4319f..ab76973f3e1e6 100644
--- a/drivers/gpu/drm/xe/xe_gt_freq.c
+++ b/drivers/gpu/drm/xe/xe_gt_freq.c
@@ -237,11 +237,11 @@ int xe_gt_freq_init(struct xe_gt *gt)
 	if (!gt->freq)
 		return -ENOMEM;

-	err = devm_add_action(xe->drm.dev, freq_fini, gt->freq);
+	err = sysfs_create_files(gt->freq, freq_attrs);
 	if (err)
 		return err;

-	err = sysfs_create_files(gt->freq, freq_attrs);
+	err = devm_add_action_or_reset(xe->drm.dev, freq_fini, gt->freq);
 	if (err)
 		return err;

diff --git a/drivers/gpu/drm/xe/xe_gt_sysfs.c b/drivers/gpu/drm/xe/xe_gt_sysfs.c
index a05c3699e8b91..ec2b8246204b8 100644
--- a/drivers/gpu/drm/xe/xe_gt_sysfs.c
+++ b/drivers/gpu/drm/xe/xe_gt_sysfs.c
@@ -51,5 +51,5 @@ int xe_gt_sysfs_init(struct xe_gt *gt)

 	gt->sysfs = &kg->base;

-	return devm_add_action(xe->drm.dev, gt_sysfs_fini, gt);
+	return devm_add_action_or_reset(xe->drm.dev, gt_sysfs_fini, gt);
 }
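The difference between the two helpers can be sketched in userspace as below. This is a hedged mock rather than the devres implementation: struct mock_dev, mock_add_action(), mock_add_action_or_reset() and mock_release() are hypothetical names, and the one-slot "device" with a fail_next_add flag exists only to simulate a registration failure. The behaviour it models is the one the commit message relies on: when registering the action fails, the _or_reset variant runs the cleanup function right away before returning the error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef void (*action_fn)(void *data);

/* Hypothetical miniature of a managed device with room for one action. */
struct mock_dev {
	action_fn action;
	void *data;
	int fail_next_add;	/* non-zero simulates a failed registration */
};

/* Plain variant: on failure the caller still owns the resource. */
int mock_add_action(struct mock_dev *dev, action_fn fn, void *data)
{
	assert(dev && fn);
	if (dev->fail_next_add)
		return -ENOMEM;
	dev->action = fn;
	dev->data = data;
	return 0;
}

/* _or_reset variant: on failure, run the cleanup action immediately. */
int mock_add_action_or_reset(struct mock_dev *dev, action_fn fn, void *data)
{
	int ret = mock_add_action(dev, fn, data);

	if (ret)
		fn(data);
	return ret;
}

/* Example cleanup: mark the resource (an int flag here) as released. */
void mock_release(void *data)
{
	*(int *)data = 0;
}
```

With fail_next_add set, mock_add_action() returns -ENOMEM and leaves the flag untouched, so the caller would have to clean up by hand; mock_add_action_or_reset() returns the same error but has already called mock_release().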
linux-stable-mirror@lists.linaro.org