5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Linus Torvalds torvalds@linux-foundation.org
commit 3628e0383dd349f02f882e612ab6184e4bb3dc10 upstream.
This reverts commit 07ed11afb68d94eadd4ffc082b97c2331307c5ea.
Stephen Rostedt reports: "I went to run my tests on my VMs and the tests hung on boot up. Unfortunately, the most I ever got out was:
[ 93.607888] Testing event system initcall: OK [ 93.667730] Running tests on all trace events: [ 93.669757] Testing all events: OK [ 95.631064] ------------[ cut here ]------------ Timed out after 60 seconds"
and further debugging points to a possible circular locking dependency between the console_owner locking and the worker pool locking.
Reverting the commit allows Steve's VM to boot to completion again.
[ This may obviously result in the "[TTM] Buffer eviction failed" messages again, which was the reason for that original revert. But at this point this seems preferable to a non-booting system... ]
Reported-and-bisected-by: Steven Rostedt rostedt@goodmis.org Link: https://lore.kernel.org/all/20240502081641.457aa25f@gandalf.local.home/ Acked-by: Maxime Ripard mripard@kernel.org Cc: Alex Constantino dreaming.about.electric.sheep@gmail.com Cc: Maxime Ripard mripard@kernel.org Cc: Timo Lindfors timo.lindfors@iki.fi Cc: Dave Airlie airlied@redhat.com Cc: Gerd Hoffmann kraxel@redhat.com Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Thomas Zimmermann tzimmermann@suse.de Cc: Daniel Vetter daniel@ffwll.ch Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/qxl/qxl_release.c | 50 +++----------------------------------- include/linux/dma-fence.h | 7 ----- 2 files changed, 5 insertions(+), 52 deletions(-)
--- a/drivers/gpu/drm/qxl/qxl_release.c +++ b/drivers/gpu/drm/qxl/qxl_release.c @@ -58,56 +58,16 @@ static long qxl_fence_wait(struct dma_fe signed long timeout) { struct qxl_device *qdev; - struct qxl_release *release; - int count = 0, sc = 0; - bool have_drawable_releases; unsigned long cur, end = jiffies + timeout;
qdev = container_of(fence->lock, struct qxl_device, release_lock); - release = container_of(fence, struct qxl_release, base); - have_drawable_releases = release->type == QXL_RELEASE_DRAWABLE;
-retry: - sc++; - - if (dma_fence_is_signaled(fence)) - goto signaled; - - qxl_io_notify_oom(qdev); - - for (count = 0; count < 11; count++) { - if (!qxl_queue_garbage_collect(qdev, true)) - break; - - if (dma_fence_is_signaled(fence)) - goto signaled; - } - - if (dma_fence_is_signaled(fence)) - goto signaled; - - if (have_drawable_releases || sc < 4) { - if (sc > 2) - /* back off */ - usleep_range(500, 1000); - - if (time_after(jiffies, end)) - return 0; - - if (have_drawable_releases && sc > 300) { - DMA_FENCE_WARN(fence, - "failed to wait on release %llu after spincount %d\n", - fence->context & ~0xf0000000, sc); - goto signaled; - } - goto retry; - } - /* - * yeah, original sync_obj_wait gave up after 3 spins when - * have_drawable_releases is not set. - */ + if (!wait_event_timeout(qdev->release_event, + (dma_fence_is_signaled(fence) || + (qxl_io_notify_oom(qdev), 0)), + timeout)) + return 0;
-signaled: cur = jiffies; if (time_after(cur, end)) return 0; --- a/include/linux/dma-fence.h +++ b/include/linux/dma-fence.h @@ -631,11 +631,4 @@ u64 dma_fence_context_alloc(unsigned num ##args); \ } while (0)
-#define DMA_FENCE_WARN(f, fmt, args...) \ - do { \ - struct dma_fence *__ff = (f); \ - pr_warn("f %llu#%llu: " fmt, __ff->context, __ff->seqno,\ - ##args); \ - } while (0) - #endif /* __LINUX_DMA_FENCE_H */