This is a note to let you know that I've just added the patch titled
net/mlx5: Cleanup IRQs in case of unload failure
to the 4.14-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git%3Ba=su...
The filename of the patch is: net-mlx5-cleanup-irqs-in-case-of-unload-failure.patch and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree, please let stable@vger.kernel.org know about it.
From foo@baz Wed Feb 28 16:23:28 CET 2018
From: Moshe Shemesh moshe@mellanox.com Date: Tue, 21 Nov 2017 15:15:51 +0200 Subject: net/mlx5: Cleanup IRQs in case of unload failure
From: Moshe Shemesh moshe@mellanox.com
[ Upstream commit d6b2785cd55ee72e9608762650b3ef299f801b1b ]
When mlx5_stop_eqs fails to destroy any of the eqs it returns with an error. In such failure flow the function will return without releasing all EQs irqs and then pci_free_irq_vectors will fail. Fix by only warn on destroy EQ failure and continue to release other EQs and their irqs.
It fixes the following kernel trace: kernel: kernel BUG at drivers/pci/msi.c:352! ... ... kernel: Call Trace: kernel: pci_disable_msix+0xd3/0x100 kernel: pci_free_irq_vectors+0xe/0x20 kernel: mlx5_load_one.isra.17+0x9f5/0xec0 [mlx5_core]
Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Signed-off-by: Moshe Shemesh moshe@mellanox.com Signed-off-by: Saeed Mahameed saeedm@mellanox.com Signed-off-by: Sasha Levin alexander.levin@verizon.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/mellanox/mlx5/core/eq.c | 20 +++++++++++++------- include/linux/mlx5/driver.h | 2 +- 2 files changed, 14 insertions(+), 8 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c @@ -776,7 +776,7 @@ err1: return err; }
-int mlx5_stop_eqs(struct mlx5_core_dev *dev) +void mlx5_stop_eqs(struct mlx5_core_dev *dev) { struct mlx5_eq_table *table = &dev->priv.eq_table; int err; @@ -785,22 +785,28 @@ int mlx5_stop_eqs(struct mlx5_core_dev * if (MLX5_CAP_GEN(dev, pg)) { err = mlx5_destroy_unmap_eq(dev, &table->pfault_eq); if (err) - return err; + mlx5_core_err(dev, "failed to destroy page fault eq, err(%d)\n", + err); } #endif
err = mlx5_destroy_unmap_eq(dev, &table->pages_eq); if (err) - return err; + mlx5_core_err(dev, "failed to destroy pages eq, err(%d)\n", + err);
- mlx5_destroy_unmap_eq(dev, &table->async_eq); + err = mlx5_destroy_unmap_eq(dev, &table->async_eq); + if (err) + mlx5_core_err(dev, "failed to destroy async eq, err(%d)\n", + err); mlx5_cmd_use_polling(dev);
err = mlx5_destroy_unmap_eq(dev, &table->cmd_eq); - if (err) + if (err) { + mlx5_core_err(dev, "failed to destroy command eq, err(%d)\n", + err); mlx5_cmd_use_events(dev); - - return err; + } }
int mlx5_core_eq_query(struct mlx5_core_dev *dev, struct mlx5_eq *eq, --- a/include/linux/mlx5/driver.h +++ b/include/linux/mlx5/driver.h @@ -1017,7 +1017,7 @@ int mlx5_create_map_eq(struct mlx5_core_ enum mlx5_eq_type type); int mlx5_destroy_unmap_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq); int mlx5_start_eqs(struct mlx5_core_dev *dev); -int mlx5_stop_eqs(struct mlx5_core_dev *dev); +void mlx5_stop_eqs(struct mlx5_core_dev *dev); int mlx5_vector2eqn(struct mlx5_core_dev *dev, int vector, int *eqn, unsigned int *irqn); int mlx5_core_attach_mcg(struct mlx5_core_dev *dev, union ib_gid *mgid, u32 qpn);
Patches currently in stable-queue which might be from moshe@mellanox.com are
queue-4.14/net-mlx5e-fix-ets-bw-check.patch queue-4.14/net-mlx5-stay-in-polling-mode-when-command-eq-destroy-fails.patch queue-4.14/net-mlx5-cleanup-irqs-in-case-of-unload-failure.patch