This is a note to let you know that I've just added the patch titled
net/mlx5: Fix health work queue spin lock to IRQ safe
to the 4.13-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git%3Ba=su...
The filename of the patch is: net-mlx5-fix-health-work-queue-spin-lock-to-irq-safe.patch and it can be found in the queue-4.13 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree, please let stable@vger.kernel.org know about it.
From foo@baz Wed Nov 15 17:25:34 CET 2017
From: Moshe Shemesh moshe@mellanox.com Date: Thu, 19 Oct 2017 14:14:29 +0300 Subject: net/mlx5: Fix health work queue spin lock to IRQ safe
From: Moshe Shemesh moshe@mellanox.com
[ Upstream commit 6377ed0bbae6fa28853e1679d068a9106c8a8908 ]
spin_lock/unlock of health->wq_lock should be IRQ safe. It was changed to spin_lock_irqsave since adding commit 0179720d6be2 ("net/mlx5: Introduce trigger_health_work function") which uses spin_lock from asynchronous event (IRQ) context. Thus, all spin_lock/unlock of health->wq_lock should have been moved to IRQ safe mode. However, one occurrence on new code using this lock missed that change, resulting in possible deadlock: kernel: Possible unsafe locking scenario: kernel: CPU0 kernel: ---- kernel: lock(&(&health->wq_lock)->rlock); kernel: <Interrupt> kernel: lock(&(&health->wq_lock)->rlock); kernel: #012 *** DEADLOCK ***
Fixes: 2a0165a034ac ("net/mlx5: Cancel delayed recovery work when unloading the driver") Signed-off-by: Moshe Shemesh moshe@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/health.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c @@ -356,10 +356,11 @@ void mlx5_drain_health_wq(struct mlx5_co void mlx5_drain_health_recovery(struct mlx5_core_dev *dev) { struct mlx5_core_health *health = &dev->priv.health; + unsigned long flags;
- spin_lock(&health->wq_lock); + spin_lock_irqsave(&health->wq_lock, flags); set_bit(MLX5_DROP_NEW_RECOVERY_WORK, &health->flags); - spin_unlock(&health->wq_lock); + spin_unlock_irqrestore(&health->wq_lock, flags); cancel_delayed_work_sync(&dev->priv.health.recover_work); }
Patches currently in stable-queue which might be from moshe@mellanox.com are
queue-4.13/net-mlx5-fix-health-work-queue-spin-lock-to-irq-safe.patch
linux-stable-mirror@lists.linaro.org