August 2023 - Linux-stable-mirror

[PATCH -stable,4.14 0/1] netfilter stable fix for 4.14

by Pablo Neira Ayuso

Hi Greg, Sasha,

This is a backport of:

  1689f25924ad ("netfilter: nf_tables: report use refcount overflow")

for -stable 4.14.

Please, apply. Thanks.

Pablo Neira Ayuso (1):
  netfilter: nf_tables: report use refcount overflow

 include/net/netfilter/nf_tables.h |  27 +++++-
 net/netfilter/nf_tables_api.c     | 143 +++++++++++++++++++-----------
 net/netfilter/nft_objref.c        |   8 +-
 3 files changed, 119 insertions(+), 59 deletions(-)

-- 
2.30.2
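The patch body is not included in this cover letter, so the following is only a minimal userspace sketch of what "report use refcount overflow" can look like: an increment helper that refuses to wrap and a caller that turns the refusal into an error. The names model_use_inc(), model_use_dec() and model_bind(), and the choice of -EMFILE as the error code, are assumptions made for illustration and are not taken from the diff.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical model of a "use" counter. Instead of silently wrapping
 * from UINT_MAX back to 0, the increment reports failure to the caller.
 */
static inline bool model_use_inc(unsigned int *use)
{
	if (*use == UINT_MAX)
		return false;	/* would overflow: leave the counter untouched */
	(*use)++;
	return true;
}

/* Dropping a reference that was never taken is a caller bug. */
static inline void model_use_dec(unsigned int *use)
{
	assert(*use > 0);
	(*use)--;
}

/*
 * Hypothetical caller: binding to an object takes a reference and
 * propagates an error when the counter is saturated.
 */
static inline int model_bind(unsigned int *use)
{
	if (!model_use_inc(use))
		return -EMFILE;	/* error code is an assumption of this sketch */
	return 0;
}
```

The point of the pattern is that every site which used to do a bare increment now has a failure path, which would be consistent with most of the diffstat above landing in nf_tables_api.c.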


[PATCH 5.15.y 5.10.y 5.4.y 1/2] nvme-tcp: fix potential unbalanced freeze & unfreeze

by Sagi Grimberg

From: Ming Lei <ming.lei(a)redhat.com>

Move start_freeze into nvme_tcp_configure_io_queues(), and there is
at least two benefits:

1) fix unbalanced freeze and unfreeze, since re-connection work may
fail or be broken by removal

2) IO during error recovery can be failfast quickly because nvme fabrics
unquiesces queues after teardown.

One side-effect is that !mpath request may timeout during connecting
because of queue topo change, but that looks not one big deal:

1) same problem exists with current code base

2) compared with !mpath, mpath use case is dominant

Fixes: 2875b0aecabe ("nvme-tcp: fix controller reset hang during traffic")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
Tested-by: Yi Zhang <yi.zhang(a)redhat.com>
Reviewed-by: Sagi Grimberg <sagi(a)grimberg.me>
Signed-off-by: Keith Busch <kbusch(a)kernel.org>
---
 drivers/nvme/host/tcp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 96d8d7844e84..c2e037644ad1 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1882,6 +1882,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 		goto out_cleanup_connect_q;
 
 	if (!new) {
+		nvme_start_freeze(ctrl);
 		nvme_start_queues(ctrl);
 		if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
 			/*
@@ -1890,6 +1891,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 			 * to be safe.
 			 */
 			ret = -ENODEV;
+			nvme_unfreeze(ctrl);
 			goto out_wait_freeze_timed_out;
 		}
 		blk_mq_update_nr_hw_queues(ctrl->tagset,
@@ -2008,7 +2010,6 @@ static void nvme_tcp_teardown_io_queues(struct nvme_ctrl *ctrl,
 	if (ctrl->queue_count <= 1)
 		return;
 	blk_mq_quiesce_queue(ctrl->admin_q);
-	nvme_start_freeze(ctrl);
 	nvme_stop_queues(ctrl);
 	nvme_sync_io_queues(ctrl);
 	nvme_tcp_stop_io_queues(ctrl);
-- 
2.41.0
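To make the "unbalanced freeze and unfreeze" argument concrete, here is a small userspace model that tracks only a freeze nesting counter. It is a sketch, not driver code: struct model_ctrl and every model_* function are hypothetical, and it assumes that the success path of the configure function ends with an unfreeze, which the hunks above do not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a controller; it only tracks freeze nesting. */
struct model_ctrl {
	int freeze_depth;
};

static inline void model_start_freeze(struct model_ctrl *c)
{
	c->freeze_depth++;
}

static inline void model_unfreeze(struct model_ctrl *c)
{
	assert(c->freeze_depth > 0);	/* unfreeze without freeze is a bug */
	c->freeze_depth--;
}

/* Old ordering: teardown starts the freeze, a later reconnect must undo it. */
static inline void model_old_teardown(struct model_ctrl *c)
{
	model_start_freeze(c);
}

static inline int model_old_configure(struct model_ctrl *c, bool connect_ok,
				      bool drained)
{
	if (!connect_ok)
		return -1;	/* reconnect failed: freeze is left pending */
	if (!drained)
		return -1;	/* wait-freeze timed out: still pending */
	model_unfreeze(c);	/* assumed success path */
	return 0;
}

/* New ordering: freeze and unfreeze both live in the configure step. */
static inline void model_new_teardown(struct model_ctrl *c)
{
	(void)c;		/* no freeze is started here any more */
}

static inline int model_new_configure(struct model_ctrl *c, bool connect_ok,
				      bool drained)
{
	if (!connect_ok)
		return -1;	/* nothing was frozen, nothing to undo */
	model_start_freeze(c);
	if (!drained) {
		model_unfreeze(c);	/* mirrors the added nvme_unfreeze() */
		return -1;
	}
	model_unfreeze(c);	/* assumed success path */
	return 0;
}
```

In this model, a failed reconnect after model_old_teardown() leaves freeze_depth at 1, while the new ordering returns to 0 on every path, including the case where the reconnect never runs at all.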


[PATCH 6.1.y 1/2] nvme-tcp: fix potential unbalanced freeze & unfreeze

by Sagi Grimberg

From: Ming Lei <ming.lei(a)redhat.com>

Move start_freeze into nvme_tcp_configure_io_queues(), and there is
at least two benefits:

1) fix unbalanced freeze and unfreeze, since re-connection work may
fail or be broken by removal

2) IO during error recovery can be failfast quickly because nvme fabrics
unquiesces queues after teardown.

One side-effect is that !mpath request may timeout during connecting
because of queue topo change, but that looks not one big deal:

1) same problem exists with current code base

2) compared with !mpath, mpath use case is dominant

Fixes: 2875b0aecabe ("nvme-tcp: fix controller reset hang during traffic")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
Tested-by: Yi Zhang <yi.zhang(a)redhat.com>
Reviewed-by: Sagi Grimberg <sagi(a)grimberg.me>
Signed-off-by: Keith Busch <kbusch(a)kernel.org>
---
 drivers/nvme/host/tcp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 1dc7c733c7e3..8d67cdd844f5 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1884,6 +1884,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 		goto out_cleanup_connect_q;
 
 	if (!new) {
+		nvme_start_freeze(ctrl);
 		nvme_start_queues(ctrl);
 		if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
 			/*
@@ -1892,6 +1893,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 			 * to be safe.
 			 */
 			ret = -ENODEV;
+			nvme_unfreeze(ctrl);
 			goto out_wait_freeze_timed_out;
 		}
 		blk_mq_update_nr_hw_queues(ctrl->tagset,
@@ -1996,7 +1998,6 @@ static void nvme_tcp_teardown_io_queues(struct nvme_ctrl *ctrl,
 	if (ctrl->queue_count <= 1)
 		return;
 	nvme_stop_admin_queue(ctrl);
-	nvme_start_freeze(ctrl);
 	nvme_stop_queues(ctrl);
 	nvme_sync_io_queues(ctrl);
 	nvme_tcp_stop_io_queues(ctrl);
-- 
2.41.0


Re: Patch "vlan: Fix VLAN 0 memory leak" has been added to the 6.4-stable tree

by Ido Schimmel

+ stable

On Sat, Aug 12, 2023 at 08:02:46PM +0200, gregkh(a)linuxfoundation.org wrote:
>
> This is a note to let you know that I've just added the patch titled
>
>     vlan: Fix VLAN 0 memory leak
>
> to the 6.4-stable tree which can be found at:
>     http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
>      vlan-fix-vlan-0-memory-leak.patch
> and it can be found in the queue-6.4 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.

Please do not add the patch to the stable tree. A problem was found and
a revert was posted:

https://patchwork.kernel.org/project/netdevbpf/patch/20230811154523.1877590…

In addition to 6.4, please do not apply to: 6.1, 5.15, 5.10, 5.4, 4.19, 4.14

Thanks

>
>
> From 718cb09aaa6fa78cc8124e9517efbc6c92665384 Mon Sep 17 00:00:00 2001
> From: Vlad Buslov <vladbu(a)nvidia.com>
> Date: Tue, 8 Aug 2023 11:35:21 +0200
> Subject: vlan: Fix VLAN 0 memory leak
>
> From: Vlad Buslov <vladbu(a)nvidia.com>
>
> commit 718cb09aaa6fa78cc8124e9517efbc6c92665384 upstream.
>
> The referenced commit intended to fix memleak of VLAN 0 that is implicitly
> created on devices with NETIF_F_HW_VLAN_CTAG_FILTER feature. However, it
> doesn't take into account that the feature can be re-set during the
> netdevice lifetime which will cause memory leak if feature is disabled
> during the device deletion as illustrated by [0]. Fix the leak by
> unconditionally deleting VLAN 0 on NETDEV_DOWN event.
>
> [0]:
> > modprobe 8021q
> > ip l set dev eth2 up
> > ethtool -K eth2 rx-vlan-filter off
> > modprobe -r mlx5_ib
> > modprobe -r mlx5_core
> > cat /sys/kernel/debug/kmemleak
> unreferenced object 0xffff888103dcd900 (size 256):
>   comm "ip", pid 1490, jiffies 4294907305 (age 325.364s)
>   hex dump (first 32 bytes):
>     00 80 5d 03 81 88 ff ff 00 00 00 00 00 00 00 00  ..].............
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<00000000899f3bb9>] kmalloc_trace+0x25/0x80
>     [<000000002889a7a2>] vlan_vid_add+0xa0/0x210
>     [<000000007177800e>] vlan_device_event+0x374/0x760 [8021q]
>     [<000000009a0716b1>] notifier_call_chain+0x35/0xb0
>     [<00000000bbf3d162>] __dev_notify_flags+0x58/0xf0
>     [<0000000053d2b05d>] dev_change_flags+0x4d/0x60
>     [<00000000982807e9>] do_setlink+0x28d/0x10a0
>     [<0000000058c1be00>] __rtnl_newlink+0x545/0x980
>     [<00000000e66c3bd9>] rtnl_newlink+0x44/0x70
>     [<00000000a2cc5970>] rtnetlink_rcv_msg+0x29c/0x390
>     [<00000000d307d1e4>] netlink_rcv_skb+0x54/0x100
>     [<00000000259d16f9>] netlink_unicast+0x1f6/0x2c0
>     [<000000007ce2afa1>] netlink_sendmsg+0x232/0x4a0
>     [<00000000f3f4bb39>] sock_sendmsg+0x38/0x60
>     [<000000002f9c0624>] ____sys_sendmsg+0x1e3/0x200
>     [<00000000d6ff5520>] ___sys_sendmsg+0x80/0xc0
> unreferenced object 0xffff88813354fde0 (size 32):
>   comm "ip", pid 1490, jiffies 4294907305 (age 325.364s)
>   hex dump (first 32 bytes):
>     a0 d9 dc 03 81 88 ff ff a0 d9 dc 03 81 88 ff ff  ................
>     81 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<00000000899f3bb9>] kmalloc_trace+0x25/0x80
>     [<000000002da64724>] vlan_vid_add+0xdf/0x210
>     [<000000007177800e>] vlan_device_event+0x374/0x760 [8021q]
>     [<000000009a0716b1>] notifier_call_chain+0x35/0xb0
>     [<00000000bbf3d162>] __dev_notify_flags+0x58/0xf0
>     [<0000000053d2b05d>] dev_change_flags+0x4d/0x60
>     [<00000000982807e9>] do_setlink+0x28d/0x10a0
>     [<0000000058c1be00>] __rtnl_newlink+0x545/0x980
>     [<00000000e66c3bd9>] rtnl_newlink+0x44/0x70
>     [<00000000a2cc5970>] rtnetlink_rcv_msg+0x29c/0x390
>     [<00000000d307d1e4>] netlink_rcv_skb+0x54/0x100
>     [<00000000259d16f9>] netlink_unicast+0x1f6/0x2c0
>     [<000000007ce2afa1>] netlink_sendmsg+0x232/0x4a0
>     [<00000000f3f4bb39>] sock_sendmsg+0x38/0x60
>     [<000000002f9c0624>] ____sys_sendmsg+0x1e3/0x200
>     [<00000000d6ff5520>] ___sys_sendmsg+0x80/0xc0
>
> Fixes: efc73f4bbc23 ("net: Fix memory leak - vlan_info struct")
> Reviewed-by: Ido Schimmel <idosch(a)nvidia.com>
> Signed-off-by: Vlad Buslov <vladbu(a)nvidia.com>
> Link: https://lore.kernel.org/r/20230808093521.1468929-1-vladbu@nvidia.com
> Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
> ---
>  net/8021q/vlan.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
>
> --- a/net/8021q/vlan.c
> +++ b/net/8021q/vlan.c
> @@ -384,8 +384,7 @@ static int vlan_device_event(struct noti
>  			dev->name);
>  		vlan_vid_add(dev, htons(ETH_P_8021Q), 0);
>  	}
> -	if (event == NETDEV_DOWN &&
> -	    (dev->features & NETIF_F_HW_VLAN_CTAG_FILTER))
> +	if (event == NETDEV_DOWN)
>  		vlan_vid_del(dev, htons(ETH_P_8021Q), 0);
>  
>  	vlan_info = rtnl_dereference(dev->vlan_info);
>
>
> Patches currently in stable-queue which might be from vladbu(a)nvidia.com are
>
> queue-6.4/vlan-fix-vlan-0-memory-leak.patch
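The leak scenario in the quoted commit message (add VLAN 0 while the filter feature is on, turn the feature off, then bring the device down) can be modelled in a few lines of userspace C. This is only a sketch of that description: struct model_dev and the model_* functions are hypothetical, the delete is modelled as a no-op when nothing is registered, and the sketch says nothing about the separate problem that led to the revert mentioned in the reply.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical device model; names are illustrative only. */
struct model_dev {
	bool hw_vlan_filter;	/* stands in for NETIF_F_HW_VLAN_CTAG_FILTER */
	int vid0_refs;		/* stands in for the VLAN 0 allocation */
};

static inline void model_vid_add(struct model_dev *d)
{
	d->vid0_refs++;
}

/* Assumption of the sketch: deleting when nothing is registered does nothing. */
static inline void model_vid_del(struct model_dev *d)
{
	if (d->vid0_refs > 0)
		d->vid0_refs--;
}

/* Device comes up: VLAN 0 is added only when the filter feature is set. */
static inline void model_up(struct model_dev *d)
{
	if (d->hw_vlan_filter)
		model_vid_add(d);
}

/* Before the change: the delete checks the feature at the time of DOWN. */
static inline void model_down_conditional(struct model_dev *d)
{
	if (d->hw_vlan_filter)
		model_vid_del(d);
}

/* After the change: the delete no longer depends on the feature flag. */
static inline void model_down_unconditional(struct model_dev *d)
{
	model_vid_del(d);
}

/* Runs the up / feature-off / down sequence and returns what is left over. */
static inline int model_leftover(bool unconditional)
{
	struct model_dev d = { .hw_vlan_filter = true, .vid0_refs = 0 };

	model_up(&d);
	d.hw_vlan_filter = false;	/* like "ethtool -K ... rx-vlan-filter off" */
	if (unconditional)
		model_down_unconditional(&d);
	else
		model_down_conditional(&d);
	assert(d.vid0_refs >= 0);
	return d.vid0_refs;
}
```

With the conditional delete, model_leftover(false) returns 1 (the entry added at UP is never released); with the unconditional delete, model_leftover(true) returns 0.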


[PATCH AUTOSEL 4.19 01/13] 9p: virtio: make sure 'offs' is initialized in zc_request

by Sasha Levin

From: Dominique Martinet <asmadeus(a)codewreck.org>

[ Upstream commit 4a73edab69d3a6623f03817fe950a2d9585f80e4 ]

Similarly to the previous patch: offs can be used in handle_rerrors
without initializing on small payloads; in this case handle_rerrors will
not use it because of the size check, but it doesn't hurt to make sure it
is zero to please scan-build.

This fixes the following warning:
net/9p/trans_virtio.c:539:3: warning: 3rd function call argument is an uninitialized value [core.CallAndMessage]
                handle_rerror(req, in_hdr_len, offs, in_pages);
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Reviewed-by: Simon Horman <simon.horman(a)corigine.com>
Signed-off-by: Dominique Martinet <asmadeus(a)codewreck.org>
Signed-off-by: Eric Van Hensbergen <ericvh(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 net/9p/trans_virtio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index f7cd8e018bde0..6b3357a77d992 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -409,7 +409,7 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 	struct page **in_pages = NULL, **out_pages = NULL;
 	struct virtio_chan *chan = client->trans;
 	struct scatterlist *sgs[4];
-	size_t offs;
+	size_t offs = 0;
 	int need_drop = 0;
 	int kicked = 0;
 
-- 
2.40.1
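The pattern the analyzer complains about can be reproduced in miniature: a local that is assigned on only one branch but always passed to a callee, where the callee happens to ignore it on the other branch. The sketch below is a hypothetical userspace reduction (model_handle(), model_request() and the threshold parameter are made up for illustration and do not mirror the 9p code); it shows the initialised form, in which the value passed is defined on every path.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical consumer: it only looks at offs for payloads above the
 * threshold, mirroring the "size check" mentioned in the commit message.
 */
static inline size_t model_handle(size_t len, size_t threshold, size_t offs)
{
	if (len <= threshold)
		return 0;	/* small payload: offs is never read */
	return offs;
}

/*
 * Hypothetical producer: offs gets a meaningful value only for large
 * payloads. Initialising it to 0 keeps the call well defined on the
 * small-payload path, which is what silences the analyzer warning.
 */
static inline size_t model_request(size_t len, size_t threshold)
{
	size_t offs = 0;

	if (len > threshold)
		offs = len - threshold;
	return model_handle(len, threshold, offs);
}

/* Small self-check of both paths. */
static inline void model_selfcheck(void)
{
	assert(model_request(10, 100) == 0);
	assert(model_request(150, 100) == 50);
}
```

Dropping the "= 0" would not change the observable result in this model, since the callee ignores the argument on the small path, which matches the commit message's point that the change is defensive rather than a functional fix.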


[PATCH AUTOSEL 5.4 01/14] 9p: virtio: make sure 'offs' is initialized in zc_request

by Sasha Levin

From: Dominique Martinet <asmadeus(a)codewreck.org>

[ Upstream commit 4a73edab69d3a6623f03817fe950a2d9585f80e4 ]

Similarly to the previous patch: offs can be used in handle_rerrors
without initializing on small payloads; in this case handle_rerrors will
not use it because of the size check, but it doesn't hurt to make sure it
is zero to please scan-build.

This fixes the following warning:
net/9p/trans_virtio.c:539:3: warning: 3rd function call argument is an uninitialized value [core.CallAndMessage]
                handle_rerror(req, in_hdr_len, offs, in_pages);
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Reviewed-by: Simon Horman <simon.horman(a)corigine.com>
Signed-off-by: Dominique Martinet <asmadeus(a)codewreck.org>
Signed-off-by: Eric Van Hensbergen <ericvh(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 net/9p/trans_virtio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
index f582351d84ecb..36b5f72e2165c 100644
--- a/net/9p/trans_virtio.c
+++ b/net/9p/trans_virtio.c
@@ -394,7 +394,7 @@ p9_virtio_zc_request(struct p9_client *client, struct p9_req_t *req,
 	struct page **in_pages = NULL, **out_pages = NULL;
 	struct virtio_chan *chan = client->trans;
 	struct scatterlist *sgs[4];
-	size_t offs;
+	size_t offs = 0;
 	int need_drop = 0;
 	int kicked = 0;
 
-- 
2.40.1


[PATCH AUTOSEL 6.1 01/47] phy: qcom-snps-femto-v2: keep cfg_ahb_clk enabled during runtime suspend

by Sasha Levin

From: Adrien Thierry <athierry(a)redhat.com>

[ Upstream commit 45d89a344eb46db9dce851c28e14f5e3c635c251 ]

In the dwc3 core, both system and runtime suspend end up calling
dwc3_suspend_common(). From there, what happens for the PHYs depends on
the USB mode and whether the controller is entering system or runtime
suspend.

HOST mode:
  (1) system suspend on a non-wakeup-capable controller

  The [1] if branch is taken. dwc3_core_exit() is called, which ends up
  calling phy_power_off() and phy_exit(). Those two functions decrease the
  PM runtime count at some point, so they will trigger the PHY runtime
  sleep (assuming the count is right).

  (2) runtime suspend / system suspend on a wakeup-capable controller

  The [1] branch is not taken. dwc3_suspend_common() calls
  phy_pm_runtime_put_sync(). Assuming the ref count is right, the PHY
  runtime suspend op is called.

DEVICE mode:
  dwc3_core_exit() is called on both runtime and system sleep
  unless the controller is already runtime suspended.

OTG mode:
  (1) system suspend : dwc3_core_exit() is called

  (2) runtime suspend : do nothing

In host mode, the code seems to make a distinction between 1) runtime
sleep / system sleep for wakeup-capable controller, and 2) system sleep
for non-wakeup-capable controller, where phy_power_off() and phy_exit()
are only called for the latter. This suggests the PHY is not supposed to
be in a fully powered-off state for runtime sleep and system sleep for
wakeup-capable controller.

Moreover, downstream, cfg_ahb_clk only gets disabled for system suspend.
The clocks are disabled by phy->set_suspend() [2] which is only called
in the system sleep path through dwc3_core_exit() [3].

With that in mind, don't disable the clocks during the femto PHY runtime
suspend callback. The clocks will only be disabled during system suspend
for non-wakeup-capable controllers, through dwc3_core_exit().

[1] https://elixir.bootlin.com/linux/v6.4/source/drivers/usb/dwc3/core.c#L1988
[2] https://git.codelinaro.org/clo/la/kernel/msm-5.4/-/blob/LV.AU.1.2.1.r2-0530…
[3] https://git.codelinaro.org/clo/la/kernel/msm-5.4/-/blob/LV.AU.1.2.1.r2-0530…

Signed-off-by: Adrien Thierry <athierry(a)redhat.com>
Link: https://lore.kernel.org/r/20230629144542.14906-2-athierry@redhat.com
Signed-off-by: Vinod Koul <vkoul(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
 drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c | 9 ---------
 1 file changed, 9 deletions(-)

diff --git a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c
index a590635962140..1d45029b19cd5 100644
--- a/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c
+++ b/drivers/phy/qualcomm/phy-qcom-snps-femto-v2.c
@@ -165,22 +165,13 @@ static int qcom_snps_hsphy_suspend(struct qcom_snps_hsphy *hsphy)
 					       0, USB2_AUTO_RESUME);
 	}
 
-	clk_disable_unprepare(hsphy->cfg_ahb_clk);
 	return 0;
 }
 
 static int qcom_snps_hsphy_resume(struct qcom_snps_hsphy *hsphy)
 {
-	int ret;
-
 	dev_dbg(&hsphy->phy->dev, "Resume QCOM SNPS PHY, mode\n");
 
-	ret = clk_prepare_enable(hsphy->cfg_ahb_clk);
-	if (ret) {
-		dev_err(&hsphy->phy->dev, "failed to enable cfg ahb clock\n");
-		return ret;
-	}
-
 	return 0;
 }
 
-- 
2.40.1


FAILED: patch "[PATCH] nvme-tcp: fix potential unbalanced freeze & unfreeze" failed to apply to 5.15-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

To reproduce the conflict and resubmit, you may use the following commands:

git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 99dc264014d5aed66ee37ddf136a38b5a2b1b529
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023081226-oak-cartoon-6115@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..

Possible dependencies:

99dc264014d5 ("nvme-tcp: fix potential unbalanced freeze & unfreeze")
9f27bd701d18 ("nvme: rename the queue quiescing helpers")
91c11d5f3254 ("nvme-rdma: stop auth work after tearing down queues in error recovery")
1f1a4f89562d ("nvme-tcp: stop auth work after tearing down queues in error recovery")
eac3ef262941 ("nvme-pci: split the initial probe from the rest path")
a6ee7f19ebfd ("nvme-pci: call nvme_pci_configure_admin_queue from nvme_pci_enable")
3f30a79c2e2c ("nvme-pci: set constant paramters in nvme_pci_alloc_ctrl")
2e87570be9d2 ("nvme-pci: factor out a nvme_pci_alloc_dev helper")
081a7d958ce4 ("nvme-pci: factor the iod mempool creation into a helper")
94cc781f69f4 ("nvme: move OPAL setup from PCIe to core")
cd50f9b24726 ("nvme: split nvme_kill_queues")
6bcd5089ee13 ("nvme: don't unquiesce the admin queue in nvme_kill_queues")
0ffc7e98bfaa ("nvme-pci: refactor the tagset handling in nvme_reset_work")
71b26083d59c ("block: set the disk capacity to 0 in blk_mark_disk_dead")
6dfba1c09c10 ("nvme-fc: use the tagset alloc/free helpers")
1864ea46155c ("nvme-fc: store the generic nvme_ctrl in set->driver_data")
cefa1032f111 ("nvme-rdma: use the tagset alloc/free helpers")
2d60738c8f80 ("nvme-rdma: store the generic nvme_ctrl in set->driver_data")
fe60e8c53411 ("nvme: add common helpers to allocate and free tagsets")
61ce339f19fa ("nvme-pci: set min_align_mask before calculating max_hw_sectors")

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 99dc264014d5aed66ee37ddf136a38b5a2b1b529 Mon Sep 17 00:00:00 2001
From: Ming Lei <ming.lei(a)redhat.com>
Date: Tue, 11 Jul 2023 17:40:40 +0800
Subject: [PATCH] nvme-tcp: fix potential unbalanced freeze & unfreeze

Move start_freeze into nvme_tcp_configure_io_queues(), and there is
at least two benefits:

1) fix unbalanced freeze and unfreeze, since re-connection work may
fail or be broken by removal

2) IO during error recovery can be failfast quickly because nvme fabrics
unquiesces queues after teardown.

One side-effect is that !mpath request may timeout during connecting
because of queue topo change, but that looks not one big deal:

1) same problem exists with current code base

2) compared with !mpath, mpath use case is dominant

Fixes: 2875b0aecabe ("nvme-tcp: fix controller reset hang during traffic")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
Tested-by: Yi Zhang <yi.zhang(a)redhat.com>
Reviewed-by: Sagi Grimberg <sagi(a)grimberg.me>
Signed-off-by: Keith Busch <kbusch(a)kernel.org>

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 3e7dd6f91832..fb24cd8ac46c 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1868,6 +1868,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 		goto out_cleanup_connect_q;
 
 	if (!new) {
+		nvme_start_freeze(ctrl);
 		nvme_unquiesce_io_queues(ctrl);
 		if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
 			/*
@@ -1876,6 +1877,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
 			 * to be safe.
 			 */
 			ret = -ENODEV;
+			nvme_unfreeze(ctrl);
 			goto out_wait_freeze_timed_out;
 		}
 		blk_mq_update_nr_hw_queues(ctrl->tagset,
@@ -1980,7 +1982,6 @@ static void nvme_tcp_teardown_io_queues(struct nvme_ctrl *ctrl,
 	if (ctrl->queue_count <= 1)
 		return;
 	nvme_quiesce_admin_queue(ctrl);
-	nvme_start_freeze(ctrl);
 	nvme_quiesce_io_queues(ctrl);
 	nvme_sync_io_queues(ctrl);
 	nvme_tcp_stop_io_queues(ctrl);


FAILED: patch "[PATCH] nvme-rdma: fix potential unbalanced freeze & unfreeze" failed to apply to 5.10-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

To reproduce the conflict and resubmit, you may use the following commands:

git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x 29b434d1e49252b3ad56ad3197e47fafff5356a1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023081243-sleet-native-6d03@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..

Possible dependencies:

29b434d1e492 ("nvme-rdma: fix potential unbalanced freeze & unfreeze")
9f27bd701d18 ("nvme: rename the queue quiescing helpers")
91c11d5f3254 ("nvme-rdma: stop auth work after tearing down queues in error recovery")
1f1a4f89562d ("nvme-tcp: stop auth work after tearing down queues in error recovery")
eac3ef262941 ("nvme-pci: split the initial probe from the rest path")
a6ee7f19ebfd ("nvme-pci: call nvme_pci_configure_admin_queue from nvme_pci_enable")
3f30a79c2e2c ("nvme-pci: set constant paramters in nvme_pci_alloc_ctrl")
2e87570be9d2 ("nvme-pci: factor out a nvme_pci_alloc_dev helper")
081a7d958ce4 ("nvme-pci: factor the iod mempool creation into a helper")
94cc781f69f4 ("nvme: move OPAL setup from PCIe to core")
cd50f9b24726 ("nvme: split nvme_kill_queues")
6bcd5089ee13 ("nvme: don't unquiesce the admin queue in nvme_kill_queues")
0ffc7e98bfaa ("nvme-pci: refactor the tagset handling in nvme_reset_work")
71b26083d59c ("block: set the disk capacity to 0 in blk_mark_disk_dead")
6dfba1c09c10 ("nvme-fc: use the tagset alloc/free helpers")
1864ea46155c ("nvme-fc: store the generic nvme_ctrl in set->driver_data")
cefa1032f111 ("nvme-rdma: use the tagset alloc/free helpers")
2d60738c8f80 ("nvme-rdma: store the generic nvme_ctrl in set->driver_data")
fe60e8c53411 ("nvme: add common helpers to allocate and free tagsets")
61ce339f19fa ("nvme-pci: set min_align_mask before calculating max_hw_sectors")

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 29b434d1e49252b3ad56ad3197e47fafff5356a1 Mon Sep 17 00:00:00 2001
From: Ming Lei <ming.lei(a)redhat.com>
Date: Tue, 11 Jul 2023 17:40:41 +0800
Subject: [PATCH] nvme-rdma: fix potential unbalanced freeze & unfreeze

Move start_freeze into nvme_rdma_configure_io_queues(), and there is
at least two benefits:

1) fix unbalanced freeze and unfreeze, since re-connection work may
fail or be broken by removal

2) IO during error recovery can be failfast quickly because nvme fabrics
unquiesces queues after teardown.

One side-effect is that !mpath request may timeout during connecting
because of queue topo change, but that looks not one big deal:

1) same problem exists with current code base

2) compared with !mpath, mpath use case is dominant

Fixes: 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
Tested-by: Yi Zhang <yi.zhang(a)redhat.com>
Reviewed-by: Sagi Grimberg <sagi(a)grimberg.me>
Signed-off-by: Keith Busch <kbusch(a)kernel.org>

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index d433b2ec07a6..337a624a537c 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -883,6 +883,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 		goto out_cleanup_tagset;
 
 	if (!new) {
+		nvme_start_freeze(&ctrl->ctrl);
 		nvme_unquiesce_io_queues(&ctrl->ctrl);
 		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
 			/*
@@ -891,6 +892,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 			 * to be safe.
 			 */
 			ret = -ENODEV;
+			nvme_unfreeze(&ctrl->ctrl);
 			goto out_wait_freeze_timed_out;
 		}
 		blk_mq_update_nr_hw_queues(ctrl->ctrl.tagset,
@@ -940,7 +942,6 @@ static void nvme_rdma_teardown_io_queues(struct nvme_rdma_ctrl *ctrl,
 		bool remove)
 {
 	if (ctrl->ctrl.queue_count > 1) {
-		nvme_start_freeze(&ctrl->ctrl);
 		nvme_quiesce_io_queues(&ctrl->ctrl);
 		nvme_sync_io_queues(&ctrl->ctrl);
 		nvme_rdma_stop_io_queues(ctrl);


FAILED: patch "[PATCH] nvme-rdma: fix potential unbalanced freeze & unfreeze" failed to apply to 5.15-stable tree

by gregkh＠linuxfoundation.org

The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.

To reproduce the conflict and resubmit, you may use the following commands:

git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.15.y
git checkout FETCH_HEAD
git cherry-pick -x 29b434d1e49252b3ad56ad3197e47fafff5356a1
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2023081242-sage-caddy-ebf9@gregkh' --subject-prefix 'PATCH 5.15.y' HEAD^..

Possible dependencies:

29b434d1e492 ("nvme-rdma: fix potential unbalanced freeze & unfreeze")
9f27bd701d18 ("nvme: rename the queue quiescing helpers")
91c11d5f3254 ("nvme-rdma: stop auth work after tearing down queues in error recovery")
1f1a4f89562d ("nvme-tcp: stop auth work after tearing down queues in error recovery")
eac3ef262941 ("nvme-pci: split the initial probe from the rest path")
a6ee7f19ebfd ("nvme-pci: call nvme_pci_configure_admin_queue from nvme_pci_enable")
3f30a79c2e2c ("nvme-pci: set constant paramters in nvme_pci_alloc_ctrl")
2e87570be9d2 ("nvme-pci: factor out a nvme_pci_alloc_dev helper")
081a7d958ce4 ("nvme-pci: factor the iod mempool creation into a helper")
94cc781f69f4 ("nvme: move OPAL setup from PCIe to core")
cd50f9b24726 ("nvme: split nvme_kill_queues")
6bcd5089ee13 ("nvme: don't unquiesce the admin queue in nvme_kill_queues")
0ffc7e98bfaa ("nvme-pci: refactor the tagset handling in nvme_reset_work")
71b26083d59c ("block: set the disk capacity to 0 in blk_mark_disk_dead")
6dfba1c09c10 ("nvme-fc: use the tagset alloc/free helpers")
1864ea46155c ("nvme-fc: store the generic nvme_ctrl in set->driver_data")
cefa1032f111 ("nvme-rdma: use the tagset alloc/free helpers")
2d60738c8f80 ("nvme-rdma: store the generic nvme_ctrl in set->driver_data")
fe60e8c53411 ("nvme: add common helpers to allocate and free tagsets")
61ce339f19fa ("nvme-pci: set min_align_mask before calculating max_hw_sectors")

thanks,

greg k-h

------------------ original commit in Linus's tree ------------------

From 29b434d1e49252b3ad56ad3197e47fafff5356a1 Mon Sep 17 00:00:00 2001
From: Ming Lei <ming.lei(a)redhat.com>
Date: Tue, 11 Jul 2023 17:40:41 +0800
Subject: [PATCH] nvme-rdma: fix potential unbalanced freeze & unfreeze

Move start_freeze into nvme_rdma_configure_io_queues(), and there is
at least two benefits:

1) fix unbalanced freeze and unfreeze, since re-connection work may
fail or be broken by removal

2) IO during error recovery can be failfast quickly because nvme fabrics
unquiesces queues after teardown.

One side-effect is that !mpath request may timeout during connecting
because of queue topo change, but that looks not one big deal:

1) same problem exists with current code base

2) compared with !mpath, mpath use case is dominant

Fixes: 9f98772ba307 ("nvme-rdma: fix controller reset hang during traffic")
Cc: stable(a)vger.kernel.org
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
Tested-by: Yi Zhang <yi.zhang(a)redhat.com>
Reviewed-by: Sagi Grimberg <sagi(a)grimberg.me>
Signed-off-by: Keith Busch <kbusch(a)kernel.org>

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index d433b2ec07a6..337a624a537c 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -883,6 +883,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 		goto out_cleanup_tagset;
 
 	if (!new) {
+		nvme_start_freeze(&ctrl->ctrl);
 		nvme_unquiesce_io_queues(&ctrl->ctrl);
 		if (!nvme_wait_freeze_timeout(&ctrl->ctrl, NVME_IO_TIMEOUT)) {
 			/*
@@ -891,6 +892,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 			 * to be safe.
 			 */
 			ret = -ENODEV;
+			nvme_unfreeze(&ctrl->ctrl);
 			goto out_wait_freeze_timed_out;
 		}
 		blk_mq_update_nr_hw_queues(ctrl->ctrl.tagset,
@@ -940,7 +942,6 @@ static void nvme_rdma_teardown_io_queues(struct nvme_rdma_ctrl *ctrl,
 		bool remove)
 {
 	if (ctrl->ctrl.queue_count > 1) {
-		nvme_start_freeze(&ctrl->ctrl);
 		nvme_quiesce_io_queues(&ctrl->ctrl);
 		nvme_sync_io_queues(&ctrl->ctrl);
 		nvme_rdma_stop_io_queues(ctrl);
