From: Frederic Weisbecker frederic@kernel.org
[ Upstream commit 77e45145e3039a0fb212556ab3f8c87f54771757 ]
napi_schedule() is expected to be called either:
* From an interrupt, where raised softirqs are handled on IRQ exit
* From a softirq disabled section, where raised softirqs are handled on the next call to local_bh_enable().
* From a softirq handler, where raised softirqs are handled on the next round in do_softirq(), or further deferred to a dedicated kthread.
Other bare tasks context may end up ignoring the raised NET_RX vector until the next random softirq handling opportunity, which may not happen before a while if the CPU goes idle afterwards with the tick stopped.
Such "misuses" have been detected on several places thanks to messages of the kind:
"NOHZ tick-stop error: local softirq work is pending, handler #08!!!"
For example:
__raise_softirq_irqoff __napi_schedule rtl8152_runtime_resume.isra.0 rtl8152_resume usb_resume_interface.isra.0 usb_resume_both __rpm_callback rpm_callback rpm_resume __pm_runtime_resume usb_autoresume_device usb_remote_wakeup hub_event process_one_work worker_thread kthread ret_from_fork ret_from_fork_asm
And also:
* drivers/net/usb/r8152.c::rtl_work_func_t * drivers/net/netdevsim/netdev.c::nsim_start_xmit
There is a long history of issues of this kind:
019edd01d174 ("ath10k: sdio: Add missing BH locking around napi_schdule()") 330068589389 ("idpf: disable local BH when scheduling napi for marker packets") e3d5d70cb483 ("net: lan78xx: fix "softirq work is pending" error") e55c27ed9ccf ("mt76: mt7615: add missing bh-disable around rx napi schedule") c0182aa98570 ("mt76: mt7915: add missing bh-disable around tx napi enable/schedule") 970be1dff26d ("mt76: disable BH around napi_schedule() calls") 019edd01d174 ("ath10k: sdio: Add missing BH locking around napi_schdule()") 30bfec4fec59 ("can: rx-offload: can_rx_offload_threaded_irq_finish(): add new function to be called from threaded interrupt") e63052a5dd3c ("mlx5e: add add missing BH locking around napi_schdule()") 83a0c6e58901 ("i40e: Invoke softirqs after napi_reschedule") bd4ce941c8d5 ("mlx4: Invoke softirqs after napi_reschedule") 8cf699ec849f ("mlx4: do not call napi_schedule() without care") ec13ee80145c ("virtio_net: invoke softirqs after __napi_schedule")
This shows that relying on the caller to arrange a proper context for the softirqs to be handled while calling napi_schedule() is very fragile and error prone. Also fixing them can also prove challenging if the caller may be called from different kinds of contexts.
Therefore fix this from napi_schedule() itself with waking up ksoftirqd when softirqs are raised from task contexts.
Reported-by: Paul Menzel pmenzel@molgen.mpg.de Reported-by: Jakub Kicinski kuba@kernel.org Reported-by: Francois Romieu romieu@fr.zoreil.com Closes: https://lore.kernel.org/lkml/354a2690-9bbf-4ccb-8769-fa94707a9340@molgen.mpg... Cc: Breno Leitao leitao@debian.org Signed-off-by: Frederic Weisbecker frederic@kernel.org Reviewed-by: Eric Dumazet edumazet@google.com Link: https://patch.msgid.link/20250223221708.27130-1-frederic@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/core/dev.c b/net/core/dev.c index c761f862bc5a2..d286e400c0d87 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -4608,7 +4608,7 @@ static inline void ____napi_schedule(struct softnet_data *sd, * we have to raise NET_RX_SOFTIRQ. */ if (!sd->in_net_rx_action) - __raise_softirq_irqoff(NET_RX_SOFTIRQ); + raise_softirq_irqoff(NET_RX_SOFTIRQ); }
#ifdef CONFIG_RPS