On Tue, Apr 14, 2020 at 09:13:28AM +0100, Chris Wilson wrote:
Quoting Sultan Alsawaf (2020-04-07 07:26:22)
From: Sultan Alsawaf <sultan@kerneltoast.com>
The following deadlock exists in i915_active_wait() due to a double lock on ref->mutex (call chain listed in order from top to bottom):
 i915_active_wait();
 mutex_lock_interruptible(&ref->mutex); <-- ref->mutex first acquired
 i915_active_request_retire();
 node_retire();
 active_retire();
 mutex_lock_nested(&ref->mutex, SINGLE_DEPTH_NESTING); <-- DEADLOCK
Fix the deadlock by skipping the second ref->mutex lock when active_retire() is called through i915_active_request_retire().
Fixes: 12c255b5dad1 ("drm/i915: Provide an i915_active.acquire callback")
Cc: stable@vger.kernel.org # 5.4.x
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Incorrect.
You missed that it cannot retire from inside the wait due to the active reference held on the i915_active for the wait.
The only point it can enter retire from inside i915_active_wait() is via the terminal __active_retire() which releases the mutex in doing so.
-Chris
The terminal __active_retire() and the rbtree_postorder_for_each_entry_safe() loop retire different objects, so this isn't true.
Sultan