From: Ido Schimmel idosch@nvidia.com
[ Upstream commit e21c52d7814f5768f05224e773644629fe124af2 ]
Device drivers register with devlink from their probe routines (under the device lock) by acquiring the devlink instance lock and calling devl_register().
Drivers that support a devlink reload usually implement the reload_{down, up}() operations in a similar fashion to their remove and probe routines, respectively.
However, while the remove and probe routines are invoked with the device lock held, the reload operations are only invoked with the devlink instance lock held. It is therefore impossible for drivers to acquire the device lock from their reload operations, as this would result in lock inversion.
The motivating use case for invoking the reload operations with the device lock held is in mlxsw which needs to trigger a PCI reset as part of the reload. The driver cannot call pci_reset_function() as this function acquires the device lock. Instead, it needs to call __pci_reset_function_locked which expects the device lock to be held.
To that end, adjust devlink to always acquire the device lock before the devlink instance lock when performing a reload.
For now, only do that when reload is triggered as part of netns dismantle. Subsequent patches will handle the case where reload is explicitly triggered by user space.
Signed-off-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Jiri Pirko jiri@nvidia.com Signed-off-by: Petr Machata petrm@nvidia.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: d7d75124965a ("devlink: Fix devlink parallel commands processing") Signed-off-by: Sasha Levin sashal@kernel.org --- net/devlink/core.c | 4 ++-- net/devlink/devl_internal.h | 15 +++++++++++++++ 2 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/net/devlink/core.c b/net/devlink/core.c index bc3d265fe2d6e..7f0b093208d75 100644 --- a/net/devlink/core.c +++ b/net/devlink/core.c @@ -503,14 +503,14 @@ static void __net_exit devlink_pernet_pre_exit(struct net *net) * all devlink instances from this namespace into init_net. */ devlinks_xa_for_each_registered_get(net, index, devlink) { - devl_lock(devlink); + devl_dev_lock(devlink, true); err = 0; if (devl_is_registered(devlink)) err = devlink_reload(devlink, &init_net, DEVLINK_RELOAD_ACTION_DRIVER_REINIT, DEVLINK_RELOAD_LIMIT_UNSPEC, &actions_performed, NULL); - devl_unlock(devlink); + devl_dev_unlock(devlink, true); devlink_put(devlink); if (err && err != -EOPNOTSUPP) pr_warn("Failed to reload devlink instance into init_net\n"); diff --git a/net/devlink/devl_internal.h b/net/devlink/devl_internal.h index 2a9b263300a4b..178abaf74a107 100644 --- a/net/devlink/devl_internal.h +++ b/net/devlink/devl_internal.h @@ -3,6 +3,7 @@ * Copyright (c) 2016 Jiri Pirko jiri@mellanox.com */
+#include <linux/device.h> #include <linux/etherdevice.h> #include <linux/mutex.h> #include <linux/netdevice.h> @@ -96,6 +97,20 @@ static inline bool devl_is_registered(struct devlink *devlink) return xa_get_mark(&devlinks, devlink->index, DEVLINK_REGISTERED); }
+static inline void devl_dev_lock(struct devlink *devlink, bool dev_lock) +{ + if (dev_lock) + device_lock(devlink->dev); + devl_lock(devlink); +} + +static inline void devl_dev_unlock(struct devlink *devlink, bool dev_lock) +{ + devl_unlock(devlink); + if (dev_lock) + device_unlock(devlink->dev); +} + typedef void devlink_rel_notify_cb_t(struct devlink *devlink, u32 obj_index); typedef void devlink_rel_cleanup_cb_t(struct devlink *devlink, u32 obj_index, u32 rel_index);