Backports of fixes for v5.2.
Chris Wilson (3): drm/i915: Make the semaphore saturation mask global drm/i915/userptr: Acquire the page lock around set_page_dirty() drm/i915: Don't dereference request if it may have been retired when printing
John Harrison (1): drm/i915: Support flags in whitlist WAs
Kenneth Graunke (1): drm/i915: Disable SAMPLER_STATE prefetching on all Gen11 steppings.
Lionel Landwerlin (3): drm/i915/perf: fix ICL perf register offsets drm/i915: whitelist PS_(DEPTH|INVOCATION)_COUNT drm/i915/icl: whitelist PS_(DEPTH|INVOCATION)_COUNT
drivers/gpu/drm/i915/i915_gem_userptr.c | 10 ++++- drivers/gpu/drm/i915/i915_perf.c | 10 +++-- drivers/gpu/drm/i915/i915_reg.h | 7 ++++ drivers/gpu/drm/i915/i915_request.c | 4 +- drivers/gpu/drm/i915/intel_context.c | 1 - drivers/gpu/drm/i915/intel_context_types.h | 2 - drivers/gpu/drm/i915/intel_engine_cs.c | 17 +++++---- drivers/gpu/drm/i915/intel_engine_types.h | 2 + drivers/gpu/drm/i915/intel_workarounds.c | 43 ++++++++++++++++++++-- 9 files changed, 77 insertions(+), 19 deletions(-)
From: Chris Wilson chris@chris-wilson.co.uk
The idea behind keeping the saturation mask local to a context backfired spectacularly. The premise with the local mask was that we would be more proactive in attempting to use semaphores after each time the context idled, and that all new contexts would attempt to use semaphores ignoring the current state of the system. This turns out to be horribly optimistic. If the system state is still oversaturated and the existing workloads have all stopped using semaphores, the new workloads would attempt to use semaphores and be deprioritised behind real work. The new contexts would not switch off using semaphores until their initial batch of low priority work had completed. Given sufficient backload load of equal user priority, this would completely starve the new work of any GPU time.
To compensate, remove the local tracking in favour of keeping it as global state on the engine -- once the system is saturated and semaphores are disabled, everyone stops attempting to use semaphores until the system is idle again. One of the reason for preferring local context tracking was that it worked with virtual engines, so for switching to global state we could either do a complete check of all the virtual siblings or simply disable semaphores for those requests. This takes the simpler approach of disabling semaphores on virtual engines.
The downside is that the decision that the engine is saturated is a local measure -- we are only checking whether or not this context was scheduled in a timely fashion, it may be legitimately delayed due to user priorities. We still have the same dilemma though, that we do not want to employ the semaphore poll unless it will be used.
v2: Explain why we need to assume the worst wrt virtual engines.
Fixes: ca6e56f654e7 ("drm/i915: Disable semaphore busywaits on saturated systems") Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: Dmitry Rogozhkin dmitry.v.rogozhkin@intel.com Cc: Dmitry Ermilov dmitry.ermilov@intel.com Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20190618074153.16055-8-chris@c... (cherry picked from commit 44d89409a12eb8333735958509d7d591b461d13d) Cc: stable@vger.kernel.org Signed-off-by: Joonas Lahtinen joonas.lahtinen@linux.intel.com --- drivers/gpu/drm/i915/i915_request.c | 4 ++-- drivers/gpu/drm/i915/intel_context.c | 1 - drivers/gpu/drm/i915/intel_context_types.h | 2 -- drivers/gpu/drm/i915/intel_engine_cs.c | 1 + drivers/gpu/drm/i915/intel_engine_types.h | 2 ++ 5 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c index c88e538b2ef4..81b48e273cbd 100644 --- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -443,7 +443,7 @@ void __i915_request_submit(struct i915_request *request) */ if (request->sched.semaphores && i915_sw_fence_signaled(&request->semaphore)) - request->hw_context->saturated |= request->sched.semaphores; + engine->saturated |= request->sched.semaphores;
/* We may be recursing from the signal callback of another i915 fence */ spin_lock_nested(&request->lock, SINGLE_DEPTH_NESTING); @@ -829,7 +829,7 @@ already_busywaiting(struct i915_request *rq) * * See the are-we-too-late? check in __i915_request_submit(). */ - return rq->sched.semaphores | rq->hw_context->saturated; + return rq->sched.semaphores | rq->engine->saturated; }
static int diff --git a/drivers/gpu/drm/i915/intel_context.c b/drivers/gpu/drm/i915/intel_context.c index 924cc556223a..8931e0fee873 100644 --- a/drivers/gpu/drm/i915/intel_context.c +++ b/drivers/gpu/drm/i915/intel_context.c @@ -230,7 +230,6 @@ intel_context_init(struct intel_context *ce, ce->gem_context = ctx; ce->engine = engine; ce->ops = engine->cops; - ce->saturated = 0;
INIT_LIST_HEAD(&ce->signal_link); INIT_LIST_HEAD(&ce->signals); diff --git a/drivers/gpu/drm/i915/intel_context_types.h b/drivers/gpu/drm/i915/intel_context_types.h index 339c7437fe82..fd47b9d49e09 100644 --- a/drivers/gpu/drm/i915/intel_context_types.h +++ b/drivers/gpu/drm/i915/intel_context_types.h @@ -59,8 +59,6 @@ struct intel_context { atomic_t pin_count; struct mutex pin_mutex; /* guards pinning and associated on-gpuing */
- intel_engine_mask_t saturated; /* submitting semaphores too late? */ - /** * active_tracker: Active tracker for the external rq activity * on this intel_context object. diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c index eea9bec04f1b..9d4f12e982c3 100644 --- a/drivers/gpu/drm/i915/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/intel_engine_cs.c @@ -1200,6 +1200,7 @@ void intel_engines_park(struct drm_i915_private *i915)
i915_gem_batch_pool_fini(&engine->batch_pool); engine->execlists.no_priolist = false; + engine->saturated = 0; }
i915->gt.active_engines = 0; diff --git a/drivers/gpu/drm/i915/intel_engine_types.h b/drivers/gpu/drm/i915/intel_engine_types.h index 1f970c76b6a6..4270ddb45f41 100644 --- a/drivers/gpu/drm/i915/intel_engine_types.h +++ b/drivers/gpu/drm/i915/intel_engine_types.h @@ -285,6 +285,8 @@ struct intel_engine_cs { struct intel_context *kernel_context; /* pinned */ struct intel_context *preempt_context; /* pinned; optional */
+ intel_engine_mask_t saturated; /* submitting semaphores too late? */ + struct drm_i915_gem_object *default_state; void *pinned_default_state;
From: Lionel Landwerlin lionel.g.landwerlin@intel.com
We got the wrong offsets (could they have changed?). New values were computed off an error state by looking up the register offset in the context image as written by the HW.
Signed-off-by: Lionel Landwerlin lionel.g.landwerlin@intel.com Fixes: 1de401c08fa805 ("drm/i915/perf: enable perf support on ICL") Acked-by: Kenneth Graunke kenneth@whitecape.org Link: https://patchwork.freedesktop.org/patch/msgid/20190610081914.25428-1-lionel.... (cherry picked from commit 8dcfdfb4501012a8d36d2157dc73925715f2befb) Cc: stable@vger.kernel.org Signed-off-by: Joonas Lahtinen joonas.lahtinen@linux.intel.com --- drivers/gpu/drm/i915/i915_perf.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index dc4ce694c06a..235aedc62b4c 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -3457,9 +3457,13 @@ void i915_perf_init(struct drm_i915_private *dev_priv) dev_priv->perf.oa.ops.enable_metric_set = gen8_enable_metric_set; dev_priv->perf.oa.ops.disable_metric_set = gen10_disable_metric_set;
- dev_priv->perf.oa.ctx_oactxctrl_offset = 0x128; - dev_priv->perf.oa.ctx_flexeu0_offset = 0x3de; - + if (IS_GEN(dev_priv, 10)) { + dev_priv->perf.oa.ctx_oactxctrl_offset = 0x128; + dev_priv->perf.oa.ctx_flexeu0_offset = 0x3de; + } else { + dev_priv->perf.oa.ctx_oactxctrl_offset = 0x124; + dev_priv->perf.oa.ctx_flexeu0_offset = 0x78e; + } dev_priv->perf.oa.gen8_valid_ctx_bit = (1<<16); } }
On Fri, Jul 26, 2019 at 10:35:50AM +0300, Joonas Lahtinen wrote:
From: Lionel Landwerlin lionel.g.landwerlin@intel.com
We got the wrong offsets (could they have changed?). New values were computed off an error state by looking up the register offset in the context image as written by the HW.
Signed-off-by: Lionel Landwerlin lionel.g.landwerlin@intel.com Fixes: 1de401c08fa805 ("drm/i915/perf: enable perf support on ICL") Acked-by: Kenneth Graunke kenneth@whitecape.org Link: https://patchwork.freedesktop.org/patch/msgid/20190610081914.25428-1-lionel.... (cherry picked from commit 8dcfdfb4501012a8d36d2157dc73925715f2befb)
This commit id is not in Linus's tree :(
From: Chris Wilson chris@chris-wilson.co.uk
set_page_dirty says:
For pages with a mapping this should be done under the page lock for the benefit of asynchronous memory errors who prefer a consistent dirty state. This rule can be broken in some special cases, but should be better not to.
Under those rules, it is only safe for us to use the plain set_page_dirty calls for shmemfs/anonymous memory. Userptr may be used with real mappings and so needs to use the locked version (set_page_dirty_lock).
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203317 Fixes: 5cc9ed4b9a7a ("drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl") References: 6dcc693bc57f ("ext4: warn when page is dirtied without buffers") Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: stable@vger.kernel.org Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20190708140327.26825-1-chris@c... (cherry picked from commit cb6d7c7dc7ff8cace666ddec66334117a6068ce2) Signed-off-by: Joonas Lahtinen joonas.lahtinen@linux.intel.com --- drivers/gpu/drm/i915/i915_gem_userptr.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c index 8079ea3af103..b1fc15c7f599 100644 --- a/drivers/gpu/drm/i915/i915_gem_userptr.c +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c @@ -678,7 +678,15 @@ i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj,
for_each_sgt_page(page, sgt_iter, pages) { if (obj->mm.dirty) - set_page_dirty(page); + /* + * As this may not be anonymous memory (e.g. shmem) + * but exist on a real mapping, we have to lock + * the page in order to dirty it -- holding + * the page reference is not sufficient to + * prevent the inode from being truncated. + * Play safe and take the lock. + */ + set_page_dirty_lock(page);
mark_page_accessed(page); put_page(page);
On Fri, Jul 26, 2019 at 10:35:51AM +0300, Joonas Lahtinen wrote:
From: Chris Wilson chris@chris-wilson.co.uk
set_page_dirty says:
For pages with a mapping this should be done under the page lock for the benefit of asynchronous memory errors who prefer a consistent dirty state. This rule can be broken in some special cases, but should be better not to.
Under those rules, it is only safe for us to use the plain set_page_dirty calls for shmemfs/anonymous memory. Userptr may be used with real mappings and so needs to use the locked version (set_page_dirty_lock).
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203317 Fixes: 5cc9ed4b9a7a ("drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl") References: 6dcc693bc57f ("ext4: warn when page is dirtied without buffers") Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: stable@vger.kernel.org Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20190708140327.26825-1-chris@c... (cherry picked from commit cb6d7c7dc7ff8cace666ddec66334117a6068ce2)
This commit id is not in Linus's tree.
I've stopped here, and dropped patch 1/8 as well. Please fix these all up to have the correct git ids and resend.
thanks,
greg k-h
Quoting Greg KH (2019-07-29 20:49:42)
On Fri, Jul 26, 2019 at 10:35:51AM +0300, Joonas Lahtinen wrote:
From: Chris Wilson chris@chris-wilson.co.uk
set_page_dirty says:
For pages with a mapping this should be done under the page lock for the benefit of asynchronous memory errors who prefer a consistent dirty state. This rule can be broken in some special cases, but should be better not to.
Under those rules, it is only safe for us to use the plain set_page_dirty calls for shmemfs/anonymous memory. Userptr may be used with real mappings and so needs to use the locked version (set_page_dirty_lock).
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203317 Fixes: 5cc9ed4b9a7a ("drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl") References: 6dcc693bc57f ("ext4: warn when page is dirtied without buffers") Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: stable@vger.kernel.org Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20190708140327.26825-1-chris@c... (cherry picked from commit cb6d7c7dc7ff8cace666ddec66334117a6068ce2)
This commit id is not in Linus's tree.
I've stopped here, and dropped patch 1/8 as well. Please fix these all up to have the correct git ids and resend.
My bad. I was first requested to send an extra drm-intel-fixes PR, and then to send directly to stable.
Now that the pull didn't get accepted, naturaly the IDs don't exist.
I'll send a fixed series.
Regards, Joonas
From: Kenneth Graunke kenneth@whitecape.org
The Demand Prefetch workaround (binding table prefetching) only applies to Icelake A0/B0. But the Sampler Prefetch workaround needs to be applied to all Gen11 steppings, according to a programming note in the SARCHKMD documentation.
Using the Intel Gallium driver, I have seen intermittent failures in the dEQP-GLES31.functional.copy_image.non_compressed.* tests. After applying this workaround, the tests reliably pass.
v2: Remove the overlap with a pre-production w/a
BSpec: 9663 Signed-off-by: Kenneth Graunke kenneth@whitecape.org Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Cc: stable@vger.kernel.org Reviewed-by: Mika Kuoppala mika.kuoppala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20190625090655.19220-1-chris@c... (cherry picked from commit f9a393875d3af13cc3267477746608dadb7f17c1) Signed-off-by: Joonas Lahtinen joonas.lahtinen@linux.intel.com --- drivers/gpu/drm/i915/intel_workarounds.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c index 841b8e515f4d..2fb70fab2d1c 100644 --- a/drivers/gpu/drm/i915/intel_workarounds.c +++ b/drivers/gpu/drm/i915/intel_workarounds.c @@ -1167,8 +1167,12 @@ rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal) if (IS_ICL_REVID(i915, ICL_REVID_A0, ICL_REVID_B0)) wa_write_or(wal, GEN7_SARCHKMD, - GEN7_DISABLE_DEMAND_PREFETCH | - GEN7_DISABLE_SAMPLER_PREFETCH); + GEN7_DISABLE_DEMAND_PREFETCH); + + /* Wa_1606682166:icl */ + wa_write_or(wal, + GEN7_SARCHKMD, + GEN7_DISABLE_SAMPLER_PREFETCH); }
if (IS_GEN_RANGE(i915, 9, 11)) {
From: John Harrison John.C.Harrison@Intel.com
Newer hardware adds flags to the whitelist work-around register. These allow per access direction privileges and ranges.
Signed-off-by: John Harrison John.C.Harrison@Intel.com Signed-off-by: Robert M. Fosha robert.m.fosha@intel.com Cc: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: Chris Wilson chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20190618010108.27499-2-John.C.... (cherry picked from commit 5380d0b781c491d94b4f4690ecf9762c1946c4ec) Signed-off-by: Joonas Lahtinen joonas.lahtinen@linux.intel.com --- drivers/gpu/drm/i915/i915_reg.h | 7 +++++++ drivers/gpu/drm/i915/intel_workarounds.c | 9 ++++++++- 2 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h index 13d6bd4e17b2..cf748b80e640 100644 --- a/drivers/gpu/drm/i915/i915_reg.h +++ b/drivers/gpu/drm/i915/i915_reg.h @@ -2510,6 +2510,13 @@ enum i915_power_well_id { #define RING_WAIT_SEMAPHORE (1 << 10) /* gen6+ */
#define RING_FORCE_TO_NONPRIV(base, i) _MMIO(((base) + 0x4D0) + (i) * 4) +#define RING_FORCE_TO_NONPRIV_RW (0 << 28) /* CFL+ & Gen11+ */ +#define RING_FORCE_TO_NONPRIV_RD (1 << 28) +#define RING_FORCE_TO_NONPRIV_WR (2 << 28) +#define RING_FORCE_TO_NONPRIV_RANGE_1 (0 << 0) /* CFL+ & Gen11+ */ +#define RING_FORCE_TO_NONPRIV_RANGE_4 (1 << 0) +#define RING_FORCE_TO_NONPRIV_RANGE_16 (2 << 0) +#define RING_FORCE_TO_NONPRIV_RANGE_64 (3 << 0) #define RING_MAX_NONPRIV_SLOTS 12
#define GEN7_TLB_RD_ADDR _MMIO(0x4700) diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c index 2fb70fab2d1c..1db826b12774 100644 --- a/drivers/gpu/drm/i915/intel_workarounds.c +++ b/drivers/gpu/drm/i915/intel_workarounds.c @@ -981,7 +981,7 @@ bool intel_gt_verify_workarounds(struct drm_i915_private *i915, }
static void -whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg) +whitelist_reg_ext(struct i915_wa_list *wal, i915_reg_t reg, u32 flags) { struct i915_wa wa = { .reg = reg @@ -990,9 +990,16 @@ whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg) if (GEM_DEBUG_WARN_ON(wal->count >= RING_MAX_NONPRIV_SLOTS)) return;
+ wa.reg.reg |= flags; _wa_add(wal, &wa); }
+static void +whitelist_reg(struct i915_wa_list *wal, i915_reg_t reg) +{ + whitelist_reg_ext(wal, reg, RING_FORCE_TO_NONPRIV_RW); +} + static void gen9_whitelist_build(struct i915_wa_list *w) { /* WaVFEStateAfterPipeControlwithMediaStateClear:skl,bxt,glk,cfl */
From: Lionel Landwerlin lionel.g.landwerlin@intel.com
CFL:C0+ changed the status of those registers which are now blacklisted by default.
This is breaking a number of CTS tests on GL & Vulkan :
KHR-GL45.pipeline_statistics_query_tests_ARB.functional_fragment_shader_invocations (GL)
dEQP-VK.query_pool.statistics_query.fragment_shader_invocations.* (Vulkan)
v2: Only use one whitelist entry (Lionel)
Bspec: 14091 Signed-off-by: Lionel Landwerlin lionel.g.landwerlin@intel.com Cc: stable@vger.kernel.org Acked-by: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Link: https://patchwork.freedesktop.org/patch/msgid/20190628120720.21682-3-lionel.... (cherry picked from commit 2c903da50f5a9522b134e488bd0f92646c46f3c0) Cc: stable@vger.kernel.org # 6883eab27481: drm/i915: Support flags in whitlist WAs Signed-off-by: Joonas Lahtinen joonas.lahtinen@linux.intel.com --- drivers/gpu/drm/i915/intel_workarounds.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c index 1db826b12774..bd964fbc667b 100644 --- a/drivers/gpu/drm/i915/intel_workarounds.c +++ b/drivers/gpu/drm/i915/intel_workarounds.c @@ -1044,6 +1044,19 @@ static void glk_whitelist_build(struct i915_wa_list *w) static void cfl_whitelist_build(struct i915_wa_list *w) { gen9_whitelist_build(w); + + /* + * WaAllowPMDepthAndInvocationCountAccessFromUMD:cfl,whl,cml,aml + * + * This covers 4 register which are next to one another : + * - PS_INVOCATION_COUNT + * - PS_INVOCATION_COUNT_UDW + * - PS_DEPTH_COUNT + * - PS_DEPTH_COUNT_UDW + */ + whitelist_reg_ext(w, PS_INVOCATION_COUNT, + RING_FORCE_TO_NONPRIV_RD | + RING_FORCE_TO_NONPRIV_RANGE_4); }
static void cnl_whitelist_build(struct i915_wa_list *w)
From: Lionel Landwerlin lionel.g.landwerlin@intel.com
The same tests failing on CFL+ platforms are also failing on ICL. Documentation doesn't list the WaAllowPMDepthAndInvocationCountAccessFromUMD workaround for ICL but applying it fixes the same tests as CFL.
v2: Use only one whitelist entry (Lionel)
Signed-off-by: Lionel Landwerlin lionel.g.landwerlin@intel.com Tested-by: Anuj Phogat anuj.phogat@gmail.com Cc: stable@vger.kernel.org Acked-by: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Link: https://patchwork.freedesktop.org/patch/msgid/20190628120720.21682-4-lionel.... (cherry picked from commit 3fe0107e45ab396342497e06b8924cdd485cde3b) Cc: stable@vger.kernel.org # 6883eab27481: drm/i915: Support flags in whitlist WAs Signed-off-by: Joonas Lahtinen joonas.lahtinen@linux.intel.com --- drivers/gpu/drm/i915/intel_workarounds.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/drivers/gpu/drm/i915/intel_workarounds.c b/drivers/gpu/drm/i915/intel_workarounds.c index bd964fbc667b..5e54de630ef1 100644 --- a/drivers/gpu/drm/i915/intel_workarounds.c +++ b/drivers/gpu/drm/i915/intel_workarounds.c @@ -1075,6 +1075,19 @@ static void icl_whitelist_build(struct i915_wa_list *w)
/* WaEnableStateCacheRedirectToCS:icl */ whitelist_reg(w, GEN9_SLICE_COMMON_ECO_CHICKEN1); + + /* + * WaAllowPMDepthAndInvocationCountAccessFromUMD:icl + * + * This covers 4 register which are next to one another : + * - PS_INVOCATION_COUNT + * - PS_INVOCATION_COUNT_UDW + * - PS_DEPTH_COUNT + * - PS_DEPTH_COUNT_UDW + */ + whitelist_reg_ext(w, PS_INVOCATION_COUNT, + RING_FORCE_TO_NONPRIV_RD | + RING_FORCE_TO_NONPRIV_RANGE_4); }
void intel_engine_init_whitelist(struct intel_engine_cs *engine)
From: Chris Wilson chris@chris-wilson.co.uk
This has caught me out on countless occasions, when we retrieve a pointer from the submission/execlists backend, it does not carry a reference to the context or ring. Those are only pinned while the request is active, so if we see the request is already completed, it may be in the process of being retired and those pointers defunct.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110938 Fixes: 3a068721a973 ("drm/i915: Show ring->start for the ELSP context/request queue") Signed-off-by: Chris Wilson chris@chris-wilson.co.uk Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20190618161951.28820-2-chris@c... (cherry picked from commit eca153603f2f020e15d071918e0daf1d56c17d29) Cc: stable@vger.kernel.org Signed-off-by: Joonas Lahtinen joonas.lahtinen@linux.intel.com --- drivers/gpu/drm/i915/intel_engine_cs.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c index 9d4f12e982c3..ecc0458a4a8d 100644 --- a/drivers/gpu/drm/i915/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/intel_engine_cs.c @@ -1349,12 +1349,13 @@ static void hexdump(struct drm_printer *m, const void *buf, size_t len) } }
-static void intel_engine_print_registers(const struct intel_engine_cs *engine, +static void intel_engine_print_registers(struct intel_engine_cs *engine, struct drm_printer *m) { struct drm_i915_private *dev_priv = engine->i915; const struct intel_engine_execlists * const execlists = &engine->execlists; + unsigned long flags; u64 addr;
if (engine->id == RCS0 && IS_GEN_RANGE(dev_priv, 4, 7)) @@ -1435,15 +1436,16 @@ static void intel_engine_print_registers(const struct intel_engine_cs *engine, idx, hws[idx * 2], hws[idx * 2 + 1]); }
- rcu_read_lock(); + spin_lock_irqsave(&engine->timeline.lock, flags); for (idx = 0; idx < execlists_num_ports(execlists); idx++) { struct i915_request *rq; unsigned int count; + char hdr[80];
rq = port_unpack(&execlists->port[idx], &count); - if (rq) { - char hdr[80]; - + if (!rq) { + drm_printf(m, "\t\tELSP[%d] idle\n", idx); + } else if (!i915_request_signaled(rq)) { snprintf(hdr, sizeof(hdr), "\t\tELSP[%d] count=%d, ring:{start:%08x, hwsp:%08x, seqno:%08x}, rq: ", idx, count, @@ -1452,11 +1454,11 @@ static void intel_engine_print_registers(const struct intel_engine_cs *engine, hwsp_seqno(rq)); print_request(m, rq, hdr); } else { - drm_printf(m, "\t\tELSP[%d] idle\n", idx); + print_request(m, rq, "\t\tELSP[%d] rq: "); } } drm_printf(m, "\t\tHW active? 0x%x\n", execlists->active); - rcu_read_unlock(); + spin_unlock_irqrestore(&engine->timeline.lock, flags); } else if (INTEL_GEN(dev_priv) > 6) { drm_printf(m, "\tPP_DIR_BASE: 0x%08x\n", ENGINE_READ(engine, RING_PP_DIR_BASE));
linux-stable-mirror@lists.linaro.org