From: Sultan Alsawaf <sultan@kerneltoast.com>
The retire and active callbacks can run simultaneously, allowing intel_context_pin() and intel_context_unpin() to run at the same time, trashing the ring and page tables. In 5.4, this was more noticeable because intel_ring_unpin() would set ring->vaddr to NULL and cause a clean NULL-pointer-dereference panic, but in newer kernels the use-after-free goes unnoticed.
The NULL-pointer-dereference looks like this:
BUG: unable to handle page fault for address: 0000000000003448
RIP: 0010:gen8_emit_flush_render+0x163/0x190
Call Trace:
 execlists_request_alloc+0x25/0x40
 __i915_request_create+0x1f4/0x2c0
 i915_request_create+0x71/0xc0
 i915_gem_do_execbuffer+0xb98/0x1a80
 ? preempt_count_add+0x68/0xa0
 ? _raw_spin_lock+0x13/0x30
 ? _raw_spin_unlock+0x16/0x30
 i915_gem_execbuffer2_ioctl+0x1de/0x3c0
 ? i915_gem_busy_ioctl+0x7f/0x1d0
 ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0
 drm_ioctl_kernel+0xb2/0x100
 drm_ioctl+0x209/0x360
 ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0
 ksys_ioctl+0x87/0xc0
 __x64_sys_ioctl+0x16/0x20
 do_syscall_64+0x4e/0x150
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Protect __intel_context_retire() with active->mutex (i.e., ref->mutex) to complement the active callback and fix the corruption.
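For illustration only, the locking pattern described above can be sketched in user space, assuming POSIX threads stand in for the kernel mutex. Everything named here (struct fake_context, fake_active(), fake_retire(), run_demo(), ITERATIONS) is hypothetical and is not i915 code; the sketch only shows that two callbacks serialized on one mutex never observe each other's half-finished update.

```c
#include <assert.h>
#include <pthread.h>

#define ITERATIONS 100000

/* Hypothetical stand-in for struct intel_context; not the kernel layout. */
struct fake_context {
	pthread_mutex_t mutex;	/* plays the role of active->mutex */
	int ring_mapped;	/* 1 while the "ring" is pinned */
	int state_mapped;	/* 1 while the "state" is pinned */
	int torn;		/* times a half-updated context was seen */
};

/* Models the active callback: pin both resources under the mutex. */
static void fake_active(struct fake_context *ce)
{
	pthread_mutex_lock(&ce->mutex);
	if (ce->ring_mapped != ce->state_mapped)
		ce->torn++;
	ce->ring_mapped = 1;
	ce->state_mapped = 1;
	pthread_mutex_unlock(&ce->mutex);
}

/* Models the retire callback: unpin both resources under the same mutex. */
static void fake_retire(struct fake_context *ce)
{
	pthread_mutex_lock(&ce->mutex);
	if (ce->ring_mapped != ce->state_mapped)
		ce->torn++;
	ce->ring_mapped = 0;
	ce->state_mapped = 0;
	pthread_mutex_unlock(&ce->mutex);
}

static void *active_thread(void *arg)
{
	struct fake_context *ce = arg;

	for (int i = 0; i < ITERATIONS; i++)
		fake_active(ce);
	return NULL;
}

static void *retire_thread(void *arg)
{
	struct fake_context *ce = arg;

	for (int i = 0; i < ITERATIONS; i++)
		fake_retire(ce);
	return NULL;
}

/* Runs both callbacks concurrently; returns how often state looked torn. */
int run_demo(void)
{
	struct fake_context ce = { .ring_mapped = 0, .state_mapped = 0, .torn = 0 };
	pthread_t a, r;
	int err;

	pthread_mutex_init(&ce.mutex, NULL);
	err = pthread_create(&a, NULL, active_thread, &ce);
	assert(err == 0);
	err = pthread_create(&r, NULL, retire_thread, &ce);
	assert(err == 0);
	pthread_join(a, NULL);
	pthread_join(r, NULL);
	pthread_mutex_destroy(&ce.mutex);
	return ce.torn;
}
```

Calling run_demo() from a small main() and building with -pthread should return 0, because each callback finishes both updates before the other can take the mutex. Dropping the lock from fake_retire() alone would reintroduce the window in which one side sees the other's partial update, which is the user-space analogue of the race this patch closes.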
Fixes: 12c255b5dad1 ("drm/i915: Provide an i915_active.acquire callback")
Cc: stable@vger.kernel.org
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
---
v2: Reduce the scope of the mutex lock to only __intel_context_retire() and
    mark it as a function that may sleep so it doesn't run in atomic context
 drivers/gpu/drm/i915/gt/intel_context.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c
index 57e8a051ddc2..9b9be8058881 100644
--- a/drivers/gpu/drm/i915/gt/intel_context.c
+++ b/drivers/gpu/drm/i915/gt/intel_context.c
@@ -221,6 +221,7 @@ static void __intel_context_retire(struct i915_active *active)
 
 	CE_TRACE(ce, "retire\n");
 
+	mutex_lock(&active->mutex);
 	set_bit(CONTEXT_VALID_BIT, &ce->flags);
 	if (ce->state)
 		__context_unpin_state(ce->state);
@@ -229,6 +230,7 @@ static void __intel_context_retire(struct i915_active *active)
 	__ring_retire(ce->ring);
 
 	intel_context_put(ce);
+	mutex_unlock(&active->mutex);
 }
 
 static int __intel_context_active(struct i915_active *active)
@@ -288,7 +290,8 @@ intel_context_init(struct intel_context *ce,
 	mutex_init(&ce->pin_mutex);
 
 	i915_active_init(&ce->active,
-			 __intel_context_active, __intel_context_retire);
+			 __intel_context_active,
+			 i915_active_may_sleep(__intel_context_retire));
 }
 
 void intel_context_fini(struct intel_context *ce)