From: Sultan Alsawaf sultan@kerneltoast.com
The retire and active callbacks can run simultaneously, allowing intel_context_pin() and intel_context_unpin() to run at the same time, trashing the ring and page tables. In 5.4, this was more noticeable because intel_ring_unpin() would set ring->vaddr to NULL and cause a clean NULL-pointer-dereference panic, but in newer kernels the use-after-free goes unnoticed.
The NULL-pointer-dereference looks like this: BUG: unable to handle page fault for address: 0000000000003448 RIP: 0010:gen8_emit_flush_render+0x163/0x190 Call Trace: execlists_request_alloc+0x25/0x40 __i915_request_create+0x1f4/0x2c0 i915_request_create+0x71/0xc0 i915_gem_do_execbuffer+0xb98/0x1a80 ? preempt_count_add+0x68/0xa0 ? _raw_spin_lock+0x13/0x30 ? _raw_spin_unlock+0x16/0x30 i915_gem_execbuffer2_ioctl+0x1de/0x3c0 ? i915_gem_busy_ioctl+0x7f/0x1d0 ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0 drm_ioctl_kernel+0xb2/0x100 drm_ioctl+0x209/0x360 ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0 ksys_ioctl+0x87/0xc0 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x4e/0x150 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Protect the retire callback with ref->mutex to complement the active callback and fix the corruption.
Fixes: 12c255b5dad1 ("drm/i915: Provide an i915_active.acquire callback") Cc: stable@vger.kernel.org Signed-off-by: Sultan Alsawaf sultan@kerneltoast.com --- drivers/gpu/drm/i915/i915_active.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index c4048628188a..0478bcf061b5 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -148,8 +148,10 @@ __active_retire(struct i915_active *ref) spin_unlock_irqrestore(&ref->tree_lock, flags);
/* After the final retire, the entire struct may be freed */ + mutex_lock(&ref->mutex); if (ref->retire) ref->retire(ref); + mutex_unlock(&ref->mutex);
/* ... except if you wait on it, you must manage your own references! */ wake_up_var(ref);
From: Sultan Alsawaf sultan@kerneltoast.com
The retire and active callbacks can run simultaneously, allowing intel_context_pin() and intel_context_unpin() to run at the same time, trashing the ring and page tables. In 5.4, this was more noticeable because intel_ring_unpin() would set ring->vaddr to NULL and cause a clean NULL-pointer-dereference panic, but in newer kernels the use-after-free goes unnoticed.
The NULL-pointer-dereference looks like this: BUG: unable to handle page fault for address: 0000000000003448 RIP: 0010:gen8_emit_flush_render+0x163/0x190 Call Trace: execlists_request_alloc+0x25/0x40 __i915_request_create+0x1f4/0x2c0 i915_request_create+0x71/0xc0 i915_gem_do_execbuffer+0xb98/0x1a80 ? preempt_count_add+0x68/0xa0 ? _raw_spin_lock+0x13/0x30 ? _raw_spin_unlock+0x16/0x30 i915_gem_execbuffer2_ioctl+0x1de/0x3c0 ? i915_gem_busy_ioctl+0x7f/0x1d0 ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0 drm_ioctl_kernel+0xb2/0x100 drm_ioctl+0x209/0x360 ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0 ksys_ioctl+0x87/0xc0 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x4e/0x150 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Protect __intel_context_retire() with active->mutex (i.e., ref->mutex) to complement the active callback and fix the corruption.
Fixes: 12c255b5dad1 ("drm/i915: Provide an i915_active.acquire callback") Cc: stable@vger.kernel.org Signed-off-by: Sultan Alsawaf sultan@kerneltoast.com --- v2: Reduce the scope of the mutex lock to only __intel_context_retire() and mark it as a function that may sleep so it doesn't run in atomic context
drivers/gpu/drm/i915/gt/intel_context.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index 57e8a051ddc2..9b9be8058881 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -221,6 +221,7 @@ static void __intel_context_retire(struct i915_active *active)
CE_TRACE(ce, "retire\n");
+ mutex_lock(&active->mutex); set_bit(CONTEXT_VALID_BIT, &ce->flags); if (ce->state) __context_unpin_state(ce->state); @@ -229,6 +230,7 @@ static void __intel_context_retire(struct i915_active *active) __ring_retire(ce->ring);
intel_context_put(ce); + mutex_unlock(&active->mutex); }
static int __intel_context_active(struct i915_active *active) @@ -288,7 +290,8 @@ intel_context_init(struct intel_context *ce, mutex_init(&ce->pin_mutex);
i915_active_init(&ce->active, - __intel_context_active, __intel_context_retire); + __intel_context_active, + i915_active_may_sleep(__intel_context_retire)); }
void intel_context_fini(struct intel_context *ce)
From: Sultan Alsawaf sultan@kerneltoast.com
Active and retire callbacks can run simultaneously, causing panics and mayhem. The most notable case is with the intel_context_pin/unpin race that causes ring and page table corruption. In 5.4, this race is more noticeable because intel_ring_unpin() sets ring->vaddr to NULL and causes a clean NULL-pointer-dereference panic, but in newer kernels this race goes unnoticed.
Here is an example of a crash caused by this race on 5.4: BUG: unable to handle page fault for address: 0000000000003448 RIP: 0010:gen8_emit_flush_render+0x163/0x190 Call Trace: execlists_request_alloc+0x25/0x40 __i915_request_create+0x1f4/0x2c0 i915_request_create+0x71/0xc0 i915_gem_do_execbuffer+0xb98/0x1a80 ? preempt_count_add+0x68/0xa0 ? _raw_spin_lock+0x13/0x30 ? _raw_spin_unlock+0x16/0x30 i915_gem_execbuffer2_ioctl+0x1de/0x3c0 ? i915_gem_busy_ioctl+0x7f/0x1d0 ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0 drm_ioctl_kernel+0xb2/0x100 drm_ioctl+0x209/0x360 ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0 ksys_ioctl+0x87/0xc0 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x4e/0x150 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Protect the active and retire callbacks with their own mutex to prevent them from running at the same time as one another. This change makes the i915_active_may_sleep() functionality a redundant for its only user, so clean that out as well since it's no longer needed.
Fixes: 12c255b5dad1 ("drm/i915: Provide an i915_active.acquire callback") Cc: stable@vger.kernel.org Signed-off-by: Sultan Alsawaf sultan@kerneltoast.com --- v3: Protecting only intel_context_pin/unpin is insufficient because all the callbacks race with each other, not just those ones. Now, all callbacks are protected from racing with their counterparts with a new mutex lock for reduced scope. This isn't as pretty, but it is unavoidable.
.../gpu/drm/i915/display/intel_frontbuffer.c | 4 +-- drivers/gpu/drm/i915/gem/i915_gem_context.c | 1 - drivers/gpu/drm/i915/gt/intel_context.c | 1 - drivers/gpu/drm/i915/gt/intel_engine_pool.c | 1 - drivers/gpu/drm/i915/gt/intel_timeline.c | 1 - drivers/gpu/drm/i915/i915_active.c | 30 ++++++++++++++----- drivers/gpu/drm/i915/i915_active_types.h | 6 +--- drivers/gpu/drm/i915/i915_vma.c | 1 - 8 files changed, 24 insertions(+), 21 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_frontbuffer.c b/drivers/gpu/drm/i915/display/intel_frontbuffer.c index 6cb02c912acc..766325948b5b 100644 --- a/drivers/gpu/drm/i915/display/intel_frontbuffer.c +++ b/drivers/gpu/drm/i915/display/intel_frontbuffer.c @@ -206,7 +206,6 @@ static int frontbuffer_active(struct i915_active *ref) return 0; }
-__i915_active_call static void frontbuffer_retire(struct i915_active *ref) { struct intel_frontbuffer *front = @@ -254,8 +253,7 @@ intel_frontbuffer_get(struct drm_i915_gem_object *obj) kref_init(&front->ref); atomic_set(&front->bits, 0); i915_active_init(&front->write, - frontbuffer_active, - i915_active_may_sleep(frontbuffer_retire)); + frontbuffer_active, frontbuffer_retire);
spin_lock(&i915->fb_tracking.lock); if (rcu_access_pointer(obj->frontbuffer)) { diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index 68326ad3b2e0..31791b4ae086 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -1057,7 +1057,6 @@ struct context_barrier_task { void *data; };
-__i915_active_call static void cb_retire(struct i915_active *base) { struct context_barrier_task *cb = container_of(base, typeof(*cb), base); diff --git a/drivers/gpu/drm/i915/gt/intel_context.c b/drivers/gpu/drm/i915/gt/intel_context.c index aea992e46c42..964e32cf5856 100644 --- a/drivers/gpu/drm/i915/gt/intel_context.c +++ b/drivers/gpu/drm/i915/gt/intel_context.c @@ -222,7 +222,6 @@ static void __ring_retire(struct intel_ring *ring) i915_active_release(&ring->vma->active); }
-__i915_active_call static void __intel_context_retire(struct i915_active *active) { struct intel_context *ce = container_of(active, typeof(*ce), active); diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pool.c b/drivers/gpu/drm/i915/gt/intel_engine_pool.c index 397186818305..60dde6978511 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pool.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pool.c @@ -61,7 +61,6 @@ static int pool_active(struct i915_active *ref) return 0; }
-__i915_active_call static void pool_retire(struct i915_active *ref) { struct intel_engine_pool_node *node = diff --git a/drivers/gpu/drm/i915/gt/intel_timeline.c b/drivers/gpu/drm/i915/gt/intel_timeline.c index 91debbc97c9a..ecd120e1cf2a 100644 --- a/drivers/gpu/drm/i915/gt/intel_timeline.c +++ b/drivers/gpu/drm/i915/gt/intel_timeline.c @@ -131,7 +131,6 @@ static void __idle_cacheline_free(struct intel_timeline_cacheline *cl) kfree_rcu(cl, rcu); }
-__i915_active_call static void __cacheline_retire(struct i915_active *active) { struct intel_timeline_cacheline *cl = diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index c4048628188a..165216aee122 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -148,8 +148,15 @@ __active_retire(struct i915_active *ref) spin_unlock_irqrestore(&ref->tree_lock, flags);
/* After the final retire, the entire struct may be freed */ - if (ref->retire) - ref->retire(ref); + if (ref->retire) { + if (ref->active) { + mutex_lock(&ref->callback_lock); + ref->retire(ref); + mutex_unlock(&ref->callback_lock); + } else { + ref->retire(ref); + } + }
/* ... except if you wait on it, you must manage your own references! */ wake_up_var(ref); @@ -281,15 +288,15 @@ void __i915_active_init(struct i915_active *ref, struct lock_class_key *mkey, struct lock_class_key *wkey) { - unsigned long bits; - debug_active_init(ref);
ref->flags = 0; ref->active = active; - ref->retire = ptr_unpack_bits(retire, &bits, 2); - if (bits & I915_ACTIVE_MAY_SLEEP) + ref->retire = retire; + if (ref->active && ref->retire) { + mutex_init(&ref->callback_lock); ref->flags |= I915_ACTIVE_RETIRE_SLEEPS; + }
spin_lock_init(&ref->tree_lock); ref->tree = RB_ROOT; @@ -428,8 +435,15 @@ int i915_active_acquire(struct i915_active *ref) return err;
if (likely(!i915_active_acquire_if_busy(ref))) { - if (ref->active) - err = ref->active(ref); + if (ref->active) { + if (ref->retire) { + mutex_lock(&ref->callback_lock); + err = ref->active(ref); + mutex_unlock(&ref->callback_lock); + } else { + err = ref->active(ref); + } + } if (!err) { spin_lock_irq(&ref->tree_lock); /* __active_retire() */ debug_active_activate(ref); diff --git a/drivers/gpu/drm/i915/i915_active_types.h b/drivers/gpu/drm/i915/i915_active_types.h index 6360c3e4b765..19b06c64c017 100644 --- a/drivers/gpu/drm/i915/i915_active_types.h +++ b/drivers/gpu/drm/i915/i915_active_types.h @@ -24,11 +24,6 @@ struct i915_active_fence {
struct active_node;
-#define I915_ACTIVE_MAY_SLEEP BIT(0) - -#define __i915_active_call __aligned(4) -#define i915_active_may_sleep(fn) ptr_pack_bits(&(fn), I915_ACTIVE_MAY_SLEEP, 2) - struct i915_active { atomic_t count; struct mutex mutex; @@ -45,6 +40,7 @@ struct i915_active {
int (*active)(struct i915_active *ref); void (*retire)(struct i915_active *ref); + struct mutex callback_lock;
struct work_struct work;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c index 08699fa069aa..0f73beb586ab 100644 --- a/drivers/gpu/drm/i915/i915_vma.c +++ b/drivers/gpu/drm/i915/i915_vma.c @@ -93,7 +93,6 @@ static int __i915_vma_active(struct i915_active *ref) return i915_vma_tryget(active_to_vma(ref)) ? 0 : -ENOENT; }
-__i915_active_call static void __i915_vma_retire(struct i915_active *ref) { i915_vma_put(active_to_vma(ref));
On Fri, Apr 03, 2020 at 03:35:15PM -0700, Sultan Alsawaf wrote:
ref->retire(ref);
mutex_unlock(&ref->callback_lock);
Ugh, this patch is still wrong because the mutex unlock after ref->retire() is a use-after-free. Fun times...
Sultan
From: Sultan Alsawaf sultan@kerneltoast.com
Active and retire callbacks can run simultaneously, causing panics and mayhem. The most notable case is with the intel_context_pin/unpin race that causes ring and page table corruption. In 5.4, this race is more noticeable because intel_ring_unpin() sets ring->vaddr to NULL and causes a clean NULL-pointer-dereference panic, but in newer kernels this race goes unnoticed.
Here is an example of a crash caused by this race on 5.4: BUG: unable to handle page fault for address: 0000000000003448 RIP: 0010:gen8_emit_flush_render+0x163/0x190 Call Trace: execlists_request_alloc+0x25/0x40 __i915_request_create+0x1f4/0x2c0 i915_request_create+0x71/0xc0 i915_gem_do_execbuffer+0xb98/0x1a80 ? preempt_count_add+0x68/0xa0 ? _raw_spin_lock+0x13/0x30 ? _raw_spin_unlock+0x16/0x30 i915_gem_execbuffer2_ioctl+0x1de/0x3c0 ? i915_gem_busy_ioctl+0x7f/0x1d0 ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0 drm_ioctl_kernel+0xb2/0x100 drm_ioctl+0x209/0x360 ? i915_gem_execbuffer_ioctl+0x2d0/0x2d0 ksys_ioctl+0x87/0xc0 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x4e/0x150 entry_SYSCALL_64_after_hwframe+0x44/0xa9
Protect the active and retire callbacks with their own lock to prevent them from running at the same time as one another.
Fixes: 12c255b5dad1 ("drm/i915: Provide an i915_active.acquire callback") Cc: stable@vger.kernel.org Signed-off-by: Sultan Alsawaf sultan@kerneltoast.com --- drivers/gpu/drm/i915/i915_active.c | 52 ++++++++++++++++++++---- drivers/gpu/drm/i915/i915_active.h | 10 ++--- drivers/gpu/drm/i915/i915_active_types.h | 2 + 3 files changed, 50 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_active.c b/drivers/gpu/drm/i915/i915_active.c index b0a499753526..5802233f71ec 100644 --- a/drivers/gpu/drm/i915/i915_active.c +++ b/drivers/gpu/drm/i915/i915_active.c @@ -147,8 +147,22 @@ __active_retire(struct i915_active *ref) spin_unlock_irqrestore(&ref->tree_lock, flags);
/* After the final retire, the entire struct may be freed */ - if (ref->retire) - ref->retire(ref); + if (ref->retire) { + if (ref->active) { + bool freed = false; + + /* Don't race with the active callback, and avoid UaF */ + down_write(&ref->rwsem); + ref->freed = &freed; + ref->retire(ref); + if (!freed) { + ref->freed = NULL; + up_write(&ref->rwsem); + } + } else { + ref->retire(ref); + } + }
/* ... except if you wait on it, you must manage your own references! */ wake_up_var(ref); @@ -278,7 +292,8 @@ void __i915_active_init(struct i915_active *ref, int (*active)(struct i915_active *ref), void (*retire)(struct i915_active *ref), struct lock_class_key *mkey, - struct lock_class_key *wkey) + struct lock_class_key *wkey, + struct lock_class_key *rkey) { unsigned long bits;
@@ -287,8 +302,13 @@ void __i915_active_init(struct i915_active *ref, ref->flags = 0; ref->active = active; ref->retire = ptr_unpack_bits(retire, &bits, 2); - if (bits & I915_ACTIVE_MAY_SLEEP) + ref->freed = NULL; + if (ref->active && ref->retire) { + __init_rwsem(&ref->rwsem, "i915_active.rwsem", rkey); ref->flags |= I915_ACTIVE_RETIRE_SLEEPS; + } else if (bits & I915_ACTIVE_MAY_SLEEP) { + ref->flags |= I915_ACTIVE_RETIRE_SLEEPS; + }
spin_lock_init(&ref->tree_lock); ref->tree = RB_ROOT; @@ -417,8 +437,20 @@ int i915_active_acquire(struct i915_active *ref) return err;
if (likely(!i915_active_acquire_if_busy(ref))) { - if (ref->active) - err = ref->active(ref); + if (ref->active) { + if (ref->retire) { + /* + * This can be a recursive call, and the mutex + * above already protects from concurrent active + * callbacks, so a read lock fits best. + */ + down_read(&ref->rwsem); + err = ref->active(ref); + up_read(&ref->rwsem); + } else { + err = ref->active(ref); + } + } if (!err) { spin_lock_irq(&ref->tree_lock); /* __active_retire() */ debug_active_activate(ref); @@ -502,16 +534,20 @@ int i915_request_await_active(struct i915_request *rq, struct i915_active *ref) return err; }
-#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) void i915_active_fini(struct i915_active *ref) { + if (ref->freed) { + *ref->freed = true; + up_write(&ref->rwsem); + } +#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) debug_active_fini(ref); GEM_BUG_ON(atomic_read(&ref->count)); GEM_BUG_ON(work_pending(&ref->work)); GEM_BUG_ON(!RB_EMPTY_ROOT(&ref->tree)); mutex_destroy(&ref->mutex); -} #endif +}
static inline bool is_idle_barrier(struct active_node *node, u64 idx) { diff --git a/drivers/gpu/drm/i915/i915_active.h b/drivers/gpu/drm/i915/i915_active.h index 51e1e854ca55..b684b1fdcc02 100644 --- a/drivers/gpu/drm/i915/i915_active.h +++ b/drivers/gpu/drm/i915/i915_active.h @@ -153,14 +153,16 @@ void __i915_active_init(struct i915_active *ref, int (*active)(struct i915_active *ref), void (*retire)(struct i915_active *ref), struct lock_class_key *mkey, - struct lock_class_key *wkey); + struct lock_class_key *wkey, + struct lock_class_key *rkey);
/* Specialise each class of i915_active to avoid impossible lockdep cycles. */ #define i915_active_init(ref, active, retire) do { \ static struct lock_class_key __mkey; \ static struct lock_class_key __wkey; \ + static struct lock_class_key __rkey; \ \ - __i915_active_init(ref, active, retire, &__mkey, &__wkey); \ + __i915_active_init(ref, active, retire, &__mkey, &__wkey, &__rkey); \ } while (0)
int i915_active_ref(struct i915_active *ref, @@ -200,11 +202,7 @@ i915_active_is_idle(const struct i915_active *ref) return !atomic_read(&ref->count); }
-#if IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM) void i915_active_fini(struct i915_active *ref); -#else -static inline void i915_active_fini(struct i915_active *ref) { } -#endif
int i915_active_acquire_preallocate_barrier(struct i915_active *ref, struct intel_engine_cs *engine); diff --git a/drivers/gpu/drm/i915/i915_active_types.h b/drivers/gpu/drm/i915/i915_active_types.h index 6360c3e4b765..aaee2548cb19 100644 --- a/drivers/gpu/drm/i915/i915_active_types.h +++ b/drivers/gpu/drm/i915/i915_active_types.h @@ -32,6 +32,8 @@ struct active_node; struct i915_active { atomic_t count; struct mutex mutex; + struct rw_semaphore rwsem; + bool *freed;
spinlock_t tree_lock; struct active_node *cache;
Chris,
Could you please take a look at this? This really is quite an important fix.
Thanks, Sultan
Quoting Sultan Alsawaf (2020-04-14 07:13:12)
Chris,
Could you please take a look at this? This really is quite an important fix.
It's crazy. See a266bf420060 for a patch that should be applied to v5.4 -Chris
On Tue, Apr 14, 2020 at 09:23:56AM +0100, Chris Wilson wrote:
Quoting Sultan Alsawaf (2020-04-14 07:13:12)
Chris,
Could you please take a look at this? This really is quite an important fix.
It's crazy. See a266bf420060 for a patch that should be applied to v5.4 -Chris
What? a266bf420060 was part of 5.4.0-rc7, so it's already in 5.4. And if you read the commit message, you would see that the problem in question affects Linus' tree.
You can break i915 in 5.6 by just adding a small delay:
diff --git a/drivers/gpu/drm/i915/gt/intel_ring.c b/drivers/gpu/drm/i915/gt/intel_ring.c index 6ff803f397c4..3a7968effdfd 100644 --- a/drivers/gpu/drm/i915/gt/intel_ring.c +++ b/drivers/gpu/drm/i915/gt/intel_ring.c @@ -10,6 +10,7 @@ #include "intel_engine.h" #include "intel_ring.h" #include "intel_timeline.h" +#include <linux/delay.h>
unsigned int intel_ring_update_space(struct intel_ring *ring) { @@ -92,6 +93,9 @@ void intel_ring_unpin(struct intel_ring *ring) else i915_gem_object_unpin_map(vma->obj);
+ mdelay(1); + ring->vaddr = NULL; + i915_vma_make_purgeable(vma); i915_vma_unpin(vma); }
This is how I reproduced the race in question. I can't even reach the greeter on my laptop with this, because i915 dies before that.
Sultan
Chris,
Could you please look at this in earnest? This is a real bug that crashes my laptop without any kind of provocation. It is undeniably a bug in i915, and I've clearly described it in my patch. If you dont like the patch, I'm open to any suggestions you have for an alternative solution. My goal here is to make i915 better, but it's difficult when communication only goes one way.
Thanks, Sultan
Quoting Sultan Alsawaf (2020-04-20 08:24:19)
Chris,
Could you please look at this in earnest? This is a real bug that crashes my laptop without any kind of provocation. It is undeniably a bug in i915, and I've clearly described it in my patch. If you dont like the patch, I'm open to any suggestions you have for an alternative solution. My goal here is to make i915 better, but it's difficult when communication only goes one way.
Hi Sultan,
The patch Chris pointed out was not part of 5.4 release. The commit message describes that it fixes the functions to be tolerant to running simultaneously. In doing that zeroing of ring->vaddr is removed so the test to do mdelay(1) and "ring->vaddr = NULL;" is not correct.
I think you might have used the wrong git command for checking the patch history:
$ git describe a266bf420060 v5.4-rc7-1996-ga266bf420060 # after -rc7 tag
$ git describe --contains a266bf420060 v5.6-rc1~34^2~21^2~326 # included in v5.6-rc1
And git log to double check:
$ git log --format=oneline kernel.org/stable/linux-5.4.y --grep="drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint" $ git log --format=oneline kernel.org/stable/linux-5.5.y --grep="drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint" 0725d9a31869e6c80630e99da366ede2848295cc drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint $ git log --format=oneline kernel.org/stable/linux-5.6.y --grep="drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint" a754012b9f2323a5d640da7eb7b095ac3b8cd012 drm/i915/execlists: Leave resetting ring to intel_ring 0725d9a31869e6c80630e99da366ede2848295cc drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint a266bf42006004306dd48a9082c35dfbff153307 drm/i915/gt: Make intel_ring_unpin() safe for concurrent pint
So it seems that the patch got pulled into v5.6 and has been backported to v5.5 but not v5.4.
Could you try applying the patch to 5.4 and seeing if the problem persists?
Regards, Joonas
On Mon, Apr 20, 2020 at 11:21:42AM +0300, Joonas Lahtinen wrote:
So it seems that the patch got pulled into v5.6 and has been backported to v5.5 but not v5.4.
You're right, that's my mistake.
In doing that zeroing of ring->vaddr is removed so the test to do mdelay(1) and "ring->vaddr = NULL;" is not correct.
I'm not so sure about this. Look at where `ring->vaddr` is assigned: -------------------------------------8<----------------------------------------- ret = i915_vma_pin(vma, 0, 0, flags); if (unlikely(ret)) goto err_unpin;
if (i915_vma_is_map_and_fenceable(vma)) addr = (void __force *)i915_vma_pin_iomap(vma); else addr = i915_gem_object_pin_map(vma->obj, i915_coherent_map_type(vma->vm->i915)); if (IS_ERR(addr)) { ret = PTR_ERR(addr); goto err_ring; }
i915_vma_make_unshrinkable(vma);
/* Discard any unused bytes beyond that submitted to hw. */ intel_ring_reset(ring, ring->emit);
ring->vaddr = addr; ------------------------------------->8-----------------------------------------
And then the converse of that is done *before* my reproducer patch does `ring->vaddr = NULL;`: -------------------------------------8<----------------------------------------- i915_vma_unset_ggtt_write(vma); if (i915_vma_is_map_and_fenceable(vma)) i915_vma_unpin_iomap(vma); else i915_gem_object_unpin_map(vma->obj);
/* mdelay(1); ring->vaddr = NULL; */
i915_vma_make_purgeable(vma); i915_vma_unpin(vma); ------------------------------------->8-----------------------------------------
Isn't the value assigned to `ring->vaddr` trashed by those function calls above where I've got the mdelay? If so, why would it be correct to let the stale value sit in `ring->vaddr`?
My interpretation of the zeroing of ring->vaddr being removed by Chris was that it was an unnecessary step before the ring was getting discarded anyway.
Sultan
Quoting Sultan Alsawaf (2020-04-20 19:15:14)
On Mon, Apr 20, 2020 at 11:21:42AM +0300, Joonas Lahtinen wrote:
So it seems that the patch got pulled into v5.6 and has been backported to v5.5 but not v5.4.
You're right, that's my mistake.
Did applying the patch to v5.4 fix the issue at hand?
Regards, Joonas
On Tue, Apr 21, 2020 at 09:51:37AM +0300, Joonas Lahtinen wrote:
Quoting Sultan Alsawaf (2020-04-20 19:15:14)
On Mon, Apr 20, 2020 at 11:21:42AM +0300, Joonas Lahtinen wrote:
So it seems that the patch got pulled into v5.6 and has been backported to v5.5 but not v5.4.
You're right, that's my mistake.
Did applying the patch to v5.4 fix the issue at hand?
Of course the patch doesn't apply as-is because the code's been shuffled around, but yes. The crashes are gone with that patch, and I don't have the motivation to check if that patch is actually correct, so hurray, problem solved. I'm not going to send the backport myself because I'll probably be ignored, so you can go ahead and do that.
Sultan
Greeting,
FYI, we noticed the following commit (built with gcc-7):
commit: 6dc0b234a64d2fdea96623381b234ec328b5a0a2 ("[PATCH] drm/i915: Fix use-after-free due to intel_context_pin/unpin race") url: https://github.com/0day-ci/linux/commits/Sultan-Alsawaf/drm-i915-Fix-use-aft... base: git://anongit.freedesktop.org/drm-intel for-linux-next
in testcase: suspend-stress with following parameters:
mode: freeze iterations: 10
on test machine: 4 threads BroadWell with 8G memory
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
If you fix the issue, kindly add following tag Reported-by: kernel test robot lkp@intel.com
kern :err : [ 209.039440] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:281 kern :err : [ 209.039594] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 34, name: kworker/3:1 kern :warn : [ 209.039709] CPU: 3 PID: 34 Comm: kworker/3:1 Not tainted 5.6.0-rc5-01501-g6dc0b234a64d2f #1 kern :warn : [ 209.039824] Hardware name: /NUC5i3RYB, BIOS RYBDWi35.86A.0363.2017.0316.1028 03/16/2017 kern :warn : [ 209.040023] Workqueue: events engine_retire [i915] kern :warn : [ 209.040093] Call Trace: kern :warn : [ 209.040140] dump_stack+0x66/0x8b kern :warn : [ 209.040192] ___might_sleep+0x102/0x120 kern :warn : [ 209.040251] mutex_lock+0x1c/0x40 kern :warn : [ 209.040380] __active_retire+0x7f/0x110 [i915] kern :warn : [ 209.040449] dma_fence_signal_locked+0x7e/0x100 kern :warn : [ 209.040595] i915_request_retire+0x315/0x370 [i915] kern :warn : [ 209.040736] retire_requests+0x4e/0x70 [i915] kern :warn : [ 209.040865] engine_retire+0x61/0x90 [i915] kern :warn : [ 209.040930] process_one_work+0x1b0/0x3e0 kern :warn : [ 209.040990] ? move_linked_works+0x6e/0xa0 kern :warn : [ 209.041051] worker_thread+0x1e5/0x3b0 kern :warn : [ 209.041108] ? process_one_work+0x3e0/0x3e0 kern :warn : [ 209.041170] kthread+0x11e/0x140 kern :warn : [ 209.041220] ? kthread_park+0x90/0x90 kern :warn : [ 209.041277] ret_from_fork+0x35/0x40 kern :debug : [ 209.045034] calling coretemp_init+0x0/0x1000 [coretemp] @ 245 kern :debug : [ 209.045252] probe of coretemp.0 returned 1 after 44 usecs kern :debug : [ 209.068661] initcall coretemp_init+0x0/0x1000 [coretemp] returned 0 after 22978 usecs kern :debug : [ 209.071902] calling powerclamp_init+0x0/0x1000 [intel_powerclamp] @ 240 kern :debug : [ 209.078262] initcall powerclamp_init+0x0/0x1000 [intel_powerclamp] returned 0 after 6104 usecs kern :info : [ 209.079857] [drm] Initialized i915 1.6.0 20200313 for 0000:00:02.0 on minor 0 kern :debug : [ 209.081471] calling pkg_temp_thermal_init+0x0/0x1000 [x86_pkg_temp_thermal] @ 240 kern :debug : [ 209.081729] initcall pkg_temp_thermal_init+0x0/0x1000 [x86_pkg_temp_thermal] returned 0 after 138 usecs kern :info : [ 209.083553] ACPI: Video Device [GFX0] (multi-head: yes rom: no post: no) kern :info : [ 209.085400] input: Video Bus as /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input9 kern :debug : [ 209.085534] calling acpi_cpufreq_init+0x0/0x1000 [acpi_cpufreq] @ 240 kern :debug : [ 209.085540] initcall acpi_cpufreq_init+0x0/0x1000 [acpi_cpufreq] returned -17 after 0 usecs kern :debug : [ 209.086704] probe of LNXVIDEO:00 returned 1 after 6197 usecs kern :info : [ 209.087484] snd_hda_intel 0000:00:03.0: bound 0000:00:02.0 (ops i915_audio_component_bind_ops [i915]) kern :debug : [ 209.087831] probe of 0000:00:02.0 returned 1 after 161586 usecs kern :debug : [ 209.088502] initcall i915_init+0x0/0x6b [i915] returned 0 after 2820 usecs kern :debug : [ 209.106252] calling acpi_cpufreq_init+0x0/0x1000 [acpi_cpufreq] @ 245 kern :debug : [ 209.106354] initcall acpi_cpufreq_init+0x0/0x1000 [acpi_cpufreq] returned -17 after 0 usecs kern :debug : [ 209.108121] calling rapl_init+0x0/0x1000 [intel_rapl_common] @ 240 kern :debug : [ 209.108266] initcall rapl_init+0x0/0x1000 [intel_rapl_common] returned 0 after 47 usecs kern :debug : [ 209.111832] calling intel_rapl_msr_driver_init+0x0/0x1000 [intel_rapl_msr] @ 242 kern :info : [ 209.112018] intel_rapl_common: Found RAPL domain package kern :info : [ 209.112097] intel_rapl_common: Found RAPL domain core kern :info : [ 209.112171] intel_rapl_common: Found RAPL domain uncore kern :info : [ 209.112246] intel_rapl_common: Found RAPL domain dram kern :debug : [ 209.120124] probe of intel_rapl_msr.0 returned 1 after 8156 usecs kern :debug : [ 209.120247] initcall intel_rapl_msr_driver_init+0x0/0x1000 [intel_rapl_msr] returned 0 after 8102 usecs kern :debug : [ 209.127972] calling hdmi_driver_init+0x0/0x1000 [snd_hda_codec_hdmi] @ 504 kern :debug : [ 209.128244] calling acpi_cpufreq_init+0x0/0x1000 [acpi_cpufreq] @ 266 kern :debug : [ 209.128343] initcall acpi_cpufreq_init+0x0/0x1000 [acpi_cpufreq] returned -17 after 1 usecs kern :debug : [ 209.130475] probe of hdaudioC0D0 returned 1 after 2387 usecs kern :debug : [ 209.130596] initcall hdmi_driver_init+0x0/0x1000 [snd_hda_codec_hdmi] returned 0 after 2194 usecs kern :info : [ 209.131772] input: HDA Intel HDMI HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:03.0/sound/card0/input10 kern :info : [ 209.131985] input: HDA Intel HDMI HDMI/DP,pcm=7 as /devices/pci0000:00/0000:00:03.0/sound/card0/input11 kern :info : [ 209.132183] input: HDA Intel HDMI HDMI/DP,pcm=8 as /devices/pci0000:00/0000:00:03.0/sound/card0/input12 kern :info : [ 209.132377] input: HDA Intel HDMI HDMI/DP,pcm=9 as /devices/pci0000:00/0000:00:03.0/sound/card0/input13 kern :info : [ 209.132574] input: HDA Intel HDMI HDMI/DP,pcm=10 as /devices/pci0000:00/0000:00:03.0/sound/card0/input14 kern :debug : [ 209.147193] calling acpi_cpufreq_init+0x0/0x1000 [acpi_cpufreq] @ 247 kern :debug : [ 209.147298] initcall acpi_cpufreq_init+0x0/0x1000 [acpi_cpufreq] returned -17 after 0 usecs kern :info : [ 209.237164] fbcon: i915drmfb (fb0) is primary device kern :info : [ 209.277039] Console: switching to colour frame buffer device 240x67 kern :info : [ 209.302577] i915 0000:00:02.0: fb0: i915drmfb frame buffer device kern :err : [ 215.244354] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2ee2000-0xa2ee2fff], got write-back kern :err : [ 215.244986] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.245244] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.245796] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.246027] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.246523] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.246768] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.247262] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.247490] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.247979] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.248210] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.248642] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2ee2000-0xa2ee2fff], got write-back kern :err : [ 215.249166] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.249398] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.249893] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.250123] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.250638] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.250883] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.251361] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.251589] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.252083] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.252312] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.252744] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2ee2000-0xa2ee2fff], got write-back kern :err : [ 215.253290] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.253521] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.254018] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.254251] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.254762] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.254988] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.255503] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.255748] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.256239] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back kern :err : [ 215.256470] x86/PAT: bmc-watchdog:596 map pfn expected mapping type uncached-minus for [mem 0xa2352000-0xa2352fff], got write-back user :notice: [ 215.849565] Kernel tests: Boot OK!
kern :info : [ 217.609110] PM: suspend entry (s2idle) kern :info : [ 217.609211] Filesystems sync: 0.000 seconds kern :info : [ 217.631728] Freezing user space processes ... (elapsed 0.000 seconds) done.
To reproduce:
git clone https://github.com/intel/lkp-tests.git cd lkp-tests bin/lkp install job.yaml # job file is attached in this email bin/lkp run job.yaml
Thanks, lkp
linux-stable-mirror@lists.linaro.org