From: Sultan Alsawaf <sultan@kerneltoast.com>
The retire and active callbacks can run simultaneously, allowing intel_context_pin() and intel_context_unpin() to run at the same time, trashing the ring and page tables. In 5.4, this was more noticeable because intel_ring_unpin() would set ring->vaddr to NULL and cause a clean NULL-pointer-dereference panic, but in newer kernels the use-after-free goes unnoticed.
The NULL-pointer-dereference looks like this:

BUG: unable to handle page fault for address: 0000000000003448
RIP: 0010:gen8_emit_flush_render+0x163/0x190
Call Trace:
 execlists_request_alloc+0x25/0x40
 __i915_request_create+0x1f4/0x2c0
 i915_request_create+0x71/0xc0
 i915_gem_do_execbuffer+0xb98/0x1a80
 ? preempt_count_add+0x68/0xa0
 ? _raw_spin_lock+0x13/0x30
 ? _raw_spin_unlock+0x16/0x30
 i915_gem_execbuffer2_ioctl+0x1de/0x3c0
 ? i915_gem_busy_ioctl+0x7f/0x1d0
 ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0
 drm_ioctl_kernel+0xb2/0x100
 drm_ioctl+0x209/0x360
 ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0
 ksys_ioctl+0x87/0xc0
 __x64_sys_ioctl+0x16/0x20
 do_syscall_64+0x4e/0x150
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Protect the retire callback with ref->mutex to complement the active callback and fix the corruption.
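As a rough illustration of the locking rule (not the i915 code itself), here is a minimal userspace sketch using pthreads. Everything in it is hypothetical: struct toy_active, toy_touch_state(), the two thread functions, and toy_run() are stand-ins invented for this example, with the pthread mutex playing the role of ref->mutex. Both the "acquire" and the "retire" paths take the same mutex around their callback, so the overlap counter should stay at zero.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define TOY_ITERATIONS 100000

/* Hypothetical stand-in for struct i915_active; not the kernel type. */
struct toy_active {
	pthread_mutex_t mutex;	/* plays the role of ref->mutex */
	atomic_int busy;	/* callbacks currently executing */
	atomic_int overlaps;	/* times two callbacks ran at once */
};

/* Shared callback body: note whether another callback is already running. */
static void toy_touch_state(struct toy_active *ref)
{
	if (atomic_fetch_add(&ref->busy, 1) != 0)
		atomic_fetch_add(&ref->overlaps, 1);
	atomic_fetch_sub(&ref->busy, 1);
}

/* Models the active (acquire) callback, which runs under the mutex. */
static void *toy_acquire_thread(void *arg)
{
	struct toy_active *ref = arg;

	for (int i = 0; i < TOY_ITERATIONS; i++) {
		pthread_mutex_lock(&ref->mutex);
		toy_touch_state(ref);
		pthread_mutex_unlock(&ref->mutex);
	}
	return NULL;
}

/* Models the retire callback, now serialized by the same mutex. */
static void *toy_retire_thread(void *arg)
{
	struct toy_active *ref = arg;

	for (int i = 0; i < TOY_ITERATIONS; i++) {
		pthread_mutex_lock(&ref->mutex);
		toy_touch_state(ref);
		pthread_mutex_unlock(&ref->mutex);
	}
	return NULL;
}

/* Runs both paths concurrently and returns the number of observed overlaps. */
int toy_run(void)
{
	struct toy_active ref;
	pthread_t acquire, retire;
	int rc;

	pthread_mutex_init(&ref.mutex, NULL);
	atomic_init(&ref.busy, 0);
	atomic_init(&ref.overlaps, 0);

	rc = pthread_create(&acquire, NULL, toy_acquire_thread, &ref);
	assert(rc == 0);
	rc = pthread_create(&retire, NULL, toy_retire_thread, &ref);
	assert(rc == 0);

	pthread_join(acquire, NULL);
	pthread_join(retire, NULL);
	pthread_mutex_destroy(&ref.mutex);

	return atomic_load(&ref.overlaps);
}
```

Calling toy_run() from a small main() built with -pthread should return 0. Dropping the lock/unlock pair from toy_retire_thread() mirrors the pre-patch situation, where the two callbacks are free to interleave.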
Fixes: 12c255b5dad1 ("drm/i915: Provide an i915_active.acquire callback")
Cc: stable@vger.kernel.org
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
---
 drivers/gpu/drm/i915/i915_active.c | 2 ++
 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c
index c4048628188a..0478bcf061b5 100644
--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -148,8 +148,10 @@ __active_retire(struct i915_active *ref)
 	spin_unlock_irqrestore(&ref->tree_lock, flags);
 
 	/* After the final retire, the entire struct may be freed */
+	mutex_lock(&ref->mutex);
 	if (ref->retire)
 		ref->retire(ref);
+	mutex_unlock(&ref->mutex);
 
 	/* ... except if you wait on it, you must manage your own references! */
 	wake_up_var(ref);