On 19.09.25 12:44, Janusz Krzysztofik wrote:
When multiple fences of an sw_sync timeline are signaled via sw_sync_ioctl_inc(), we now disable interrupts and keep them disabled while signaling all requested fences of the timeline in a loop. Since user space may set up an arbitrary long timeline of fences with arbitrarily expensive callbacks added to each fence, we may end up running with interrupts disabled for too long, longer than NMI watchdog limit. That potentially risky scenario has been demonstrated on Intel DRM CI trybot[1], on a low end machine fi-pnv-d510, with one of new IGT subtests that try to reimplement wait_* test cases of a dma_fence_chain selftest in user space.
The purpose of the sw-sync is to test what happens if drivers exposing dma-fences doesn't behave well.
So being able to trigger the NMI watchdog for example is part of why that functionality exists in the first place.
See this WARNING in the Kconfig file as well:
config SW_SYNC bool "Sync File Validation Framework" default n depends on SYNC_FILE depends on DEBUG_FS help A sync object driver that uses a 32bit counter to coordinate synchronization. Useful when there is no hardware primitive backing the synchronization.
WARNING: improper use of this can result in deadlocking kernel drivers from userspace. Intended for test and debug only.
You can actually use the functionality to intentionally deadlock drivers and even the core memory management.
Regards, Christian.
[141.993704] [IGT] syncobj_timeline: starting subtest stress-enable-all-signal-all-forward [164.964389] watchdog: CPU3: Watchdog detected hard LOCKUP on cpu 3 [164.964407] Modules linked in: snd_hda_codec_alc662 snd_hda_codec_realtek_lib snd_hda_codec_generic snd_hda_intel snd_intel_dspcfg snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd soundcore i915 prime_numbers ttm drm_buddy drm_display_helper cec rc_core i2c_algo_bit video wmi overlay at24 ppdev gpio_ich binfmt_misc nls_iso8859_1 coretemp i2c_i801 i2c_mux i2c_smbus r8169 lpc_ich realtek parport_pc parport nvme_fabrics dm_multipath fuse msr efi_pstore nfnetlink autofs4 [164.964569] irq event stamp: 1002206 [164.964575] hardirqs last enabled at (1002205): [<ffffffff82898ac7>] _raw_spin_unlock_irq+0x27/0x70 [164.964599] hardirqs last disabled at (1002206): [<ffffffff8287d021>] sysvec_irq_work+0x11/0xc0 [164.964616] softirqs last enabled at (1002138): [<ffffffff81341bc5>] fpu_clone+0xb5/0x270 [164.964631] softirqs last disabled at (1002136): [<ffffffff81341b97>] fpu_clone+0x87/0x270 [164.964650] CPU: 3 UID: 0 PID: 1515 Comm: syncobj_timelin Tainted: G U 6.17.0-rc6-Trybot_154715v1-gc1b827f32471+ #1 PREEMPT(voluntary) [164.964662] Tainted: [U]=USER [164.964665] Hardware name: /D510MO, BIOS MOPNV10J.86A.0311.2010.0802.2346 08/02/2010 [164.964669] RIP: 0010:lock_release+0x13d/0x2a0 [164.964680] Code: c2 01 48 8d 4d c8 44 89 f6 4c 89 ef e8 bc fc ff ff 0b 05 96 ca 42 06 0f 84 fc 00 00 00 b8 ff ff ff ff 65 0f c1 05 0b 71 a9 02 <83> f8 01 0f 85 2f 01 00 00 48 f7 45 c0 00 02 00 00 74 06 fb 0f 1f [164.964686] RSP: 0018:ffffc90000170e70 EFLAGS: 00000057 [164.964693] RAX: 0000000000000001 RBX: ffffffff83595520 RCX: 0000000000000000 [164.964698] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [164.964701] RBP: ffffc90000170eb0 R08: 0000000000000000 R09: 0000000000000000 [164.964706] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffff8226a948 [164.964710] R13: ffff88802423b340 R14: 0000000000000001 R15: ffff88802423c238 [164.964714] FS: 0000729f4d972940(0000) GS:ffff8880f8e77000(0000) knlGS:0000000000000000 [164.964720] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [164.964725] CR2: 0000729f4d92e720 CR3: 000000003afe4000 CR4: 00000000000006f0 [164.964729] Call Trace: [164.964734] <IRQ> [164.964750] dma_fence_chain_get_prev+0x13d/0x240 [164.964769] dma_fence_chain_walk+0xbd/0x200 [164.964784] dma_fence_chain_enable_signaling+0xb2/0x280 [164.964803] dma_fence_chain_irq_work+0x1b/0x80 [164.964816] irq_work_single+0x75/0xa0 [164.964834] irq_work_run_list+0x33/0x60 [164.964846] irq_work_run+0x18/0x40 [164.964856] __sysvec_irq_work+0x35/0x170 [164.964868] sysvec_irq_work+0x9b/0xc0 [164.964879] </IRQ> [164.964882] <TASK> [164.964890] asm_sysvec_irq_work+0x1b/0x20 [164.964900] RIP: 0010:_raw_spin_unlock_irq+0x2d/0x70 [164.964907] Code: 00 00 55 48 89 e5 53 48 89 fb 48 83 c7 18 48 8b 75 08 e8 06 63 bf fe 48 89 df e8 be 98 bf fe e8 59 ee d3 fe fb 0f 1f 44 00 00 <65> ff 0d 5c 85 68 01 74 14 48 8b 5d f8 c9 31 c0 31 d2 31 c9 31 f6 [164.964913] RSP: 0018:ffffc9000070fca0 EFLAGS: 00000246 [164.964919] RAX: 0000000000000000 RBX: ffff88800c2d8b10 RCX: 0000000000000000 [164.964923] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [164.964927] RBP: ffffc9000070fca8 R08: 0000000000000000 R09: 0000000000000000 [164.964931] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88800c2d8ac0 [164.964934] R13: ffffc9000070fcc8 R14: ffff88800c2d8ac0 R15: 00000000ffffffff [164.964967] sync_timeline_signal+0x153/0x2c0 [164.964989] sw_sync_ioctl+0x98/0x580 [164.965017] __x64_sys_ioctl+0xa2/0x100 [164.965034] x64_sys_call+0x1226/0x2680 [164.965046] do_syscall_64+0x93/0x980 [164.965057] ? do_syscall_64+0x1b7/0x980 [164.965070] ? lock_release+0xce/0x2a0 [164.965082] ? __might_fault+0x53/0xb0 [164.965096] ? __might_fault+0x89/0xb0 [164.965104] ? __might_fault+0x53/0xb0 [164.965116] ? _copy_to_user+0x53/0x70 [164.965131] ? __x64_sys_rt_sigprocmask+0x8f/0xe0 [164.965152] ? do_syscall_64+0x1b7/0x980 [164.965169] entry_SYSCALL_64_after_hwframe+0x76/0x7e [164.965176] RIP: 0033:0x729f4fb24ded [164.965188] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00 [164.965193] RSP: 002b:00007ffdc36220e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [164.965200] RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 0000729f4fb24ded [164.965205] RDX: 00007ffdc3622174 RSI: 0000000040045701 RDI: 0000000000000007 [164.965209] RBP: 00007ffdc3622130 R08: 0000000000000000 R09: 0000000000000000 [164.965213] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdc3622174 [164.965217] R13: 0000000040045701 R14: 0000000000000007 R15: 0000000000000003 [164.965248] </TASK> [166.952984] perf: interrupt took too long (11861 > 6217), lowering kernel.perf_event_max_sample_rate to 16000 [166.953134] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 13036276804 wd_nsec: 13036274445
Avoid potentially expensive signaling of each fence when removing it from the timeline from inside the loop under protection of a common lock and disabled interrupts, do that only after interrupts are re-enabled. Each call to dma_fence_signal() will then disable and re-enable interrputs as needed for processing of each signaled fence.
[1] https://patchwork.freedesktop.org/series/154715/
Fixes: 0f0d8406fb9c3 ("android: convert sync to fence api, v6") Signed-off-by: Janusz Krzysztofik janusz.krzysztofik@linux.intel.com
drivers/dma-buf/sw_sync.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c index 3c20f1d31cf54..638c2f756299a 100644 --- a/drivers/dma-buf/sw_sync.c +++ b/drivers/dma-buf/sw_sync.c @@ -224,13 +224,12 @@ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc) list_move_tail(&pt->link, &signalled); rb_erase(&pt->node, &obj->pt_tree);
}dma_fence_signal_locked(&pt->base);
spin_unlock_irq(&obj->lock); list_for_each_entry_safe(pt, next, &signalled, link) {
list_del_init(&pt->link); dma_fence_put(&pt->base); }dma_fence_signal(&pt->base);