On Thu, May 22, 2025 at 03:13:46PM -0700, Jakub Kicinski wrote:
On Wed, 21 May 2025 03:25:03 -0700 Saurabh Sengar wrote:
The MANA driver's probe registers netdevice via the following call chain:
mana_probe()
  register_netdev()
    register_netdevice()
register_netdevice() calls the notifier callback for the netvsc driver while holding the netdev mutex via netdev_lock_ops().
Further, this netvsc notifier callback ends up attempting to acquire the same lock again in dev_xdp_propagate(), leading to a deadlock.
netvsc_netdev_event()
  netvsc_vf_setxdp()
    dev_xdp_propagate()
This deadlock was not observed so far because net_shaper_ops was never set,
The lock is on the VF, I think you meant to say that no device you use in Azure is ops locked?
There's also the call to netvsc_register_vf() on probe path, please fix or explain why it doesn't need locking in the commit message.
This patch specifically addresses the netvsc_register_vf() path only. I omitted the mention of netvsc_register_vf() in the commit message to keep the function path shorter. The full stack trace is provided below:
[ 92.542180] dev_xdp_propagate+0x2c/0x1b0
[ 92.542185] netvsc_vf_setxdp+0x10d/0x180 [hv_netvsc]
[ 92.542192] netvsc_register_vf.part.0+0x179/0x200 [hv_netvsc]
[ 92.542196] netvsc_netdev_event+0x267/0x340 [hv_netvsc]
[ 92.542200] notifier_call_chain+0x5f/0xc0
[ 92.542203] raw_notifier_call_chain+0x16/0x20
[ 92.542205] call_netdevice_notifiers_info+0x52/0xa0
[ 92.542209] register_netdevice+0x7c8/0xaa0
[ 92.542211] register_netdev+0x1f/0x40
[ 92.542214] mana_probe+0x6e2/0x8e0 [mana]
[ 92.542220] mana_gd_probe+0x187/0x220 [mana]
If you prefer, I can update the call chain in the commit message from:

netvsc_netdev_event()
  netvsc_vf_setxdp()
    dev_xdp_propagate()

to:

netvsc_netdev_event()
  netvsc_register_vf()
    netvsc_vf_setxdp()
      dev_xdp_propagate()
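
For illustration only, the re-acquisition pattern described above can be modeled in userspace. This is a rough sketch, not kernel code: the model_*() functions and vf_lock are hypothetical stand-ins for register_netdevice(), the netvsc notifier path, dev_xdp_propagate() and the VF netdev instance lock. It uses a pthread error-checking mutex so that the second lock attempt returns EDEADLK instead of hanging, which is what a non-recursive kernel mutex would do.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the VF netdev instance lock. */
static pthread_mutex_t vf_lock;

/* Models dev_xdp_propagate(): tries to take the instance lock itself. */
static int model_xdp_propagate(void)
{
	int err = pthread_mutex_lock(&vf_lock);

	if (err)
		return err;	/* EDEADLK when the caller already holds vf_lock */
	pthread_mutex_unlock(&vf_lock);
	return 0;
}

/* Models the notifier path: event -> register_vf -> vf_setxdp -> propagate. */
static int model_netdev_event(void)
{
	return model_xdp_propagate();
}

/* Models register_netdevice(): runs the notifier with the lock held. */
int model_register_netdevice(void)
{
	pthread_mutexattr_t attr;
	int err;

	/* Error-checking type: relocking by the owner fails instead of blocking. */
	pthread_mutexattr_init(&attr);
	pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	pthread_mutex_init(&vf_lock, &attr);
	pthread_mutexattr_destroy(&attr);

	pthread_mutex_lock(&vf_lock);	/* lock taken before calling notifiers */
	err = model_netdev_event();	/* nested attempt on the same lock */
	pthread_mutex_unlock(&vf_lock);

	pthread_mutex_destroy(&vf_lock);
	return err;
}
```

Calling model_register_netdevice() returns EDEADLK, which corresponds to the point where the real call chain would block forever.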
- Saurabh
-- pw-bot: cr