4.14-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mohammed Gamal mgamal@redhat.com
[ Commit a56d99d714665591fed8527b90eef21530ea61e0 upstream. ]
Prior to commit 0cf737808ae7 ("hv_netvsc: netvsc_teardown_gpadl() split") the call sequence in netvsc_device_remove() was as follows (as implemented in netvsc_destroy_buf()): 1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message 2- Teardown receive buffer GPADL 3- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message 4- Teardown send buffer GPADL 5- Close vmbus
This didn't work for WS2016 hosts. Commit 0cf737808ae7 ("hv_netvsc: netvsc_teardown_gpadl() split") rearranged the teardown sequence as follows: 1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message 2- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message 3- Close vmbus 4- Teardown receive buffer GPADL 5- Teardown send buffer GPADL
That worked well for WS2016 hosts, but it prevented guests on older hosts from shutting down after changing network settings. Commit 0ef58b0a05c1 ("hv_netvsc: change GPAD teardown order on older versions") ensured the following message sequence for older hosts 1- Send NVSP_MSG1_TYPE_REVOKE_RECV_BUF message 2- Send NVSP_MSG1_TYPE_REVOKE_SEND_BUF message 3- Teardown receive buffer GPADL 4- Teardown send buffer GPADL 5- Close vmbus
However, with this sequence calling `ip link set eth0 mtu 1000` hangs and the process becomes uninterruptible. On futher analysis it turns out that on tearing down the receive buffer GPADL the kernel is waiting indefinitely in vmbus_teardown_gpadl() for a completion to be signaled.
Here is a snippet of where this occurs: int vmbus_teardown_gpadl(struct vmbus_channel *channel, u32 gpadl_handle) { struct vmbus_channel_gpadl_teardown *msg; struct vmbus_channel_msginfo *info; unsigned long flags; int ret;
info = kmalloc(sizeof(*info) + sizeof(struct vmbus_channel_gpadl_teardown), GFP_KERNEL); if (!info) return -ENOMEM;
init_completion(&info->waitevent); info->waiting_channel = channel; [....] ret = vmbus_post_msg(msg, sizeof(struct vmbus_channel_gpadl_teardown), true);
if (ret) goto post_msg_err;
wait_for_completion(&info->waitevent); [....] }
The completion is signaled from vmbus_ongpadl_torndown(), which gets called when the corresponding message is received from the host, which apparently never happens in that case. This patch works around the issue by restoring the first mentioned message sequence for older hosts
Fixes: 0ef58b0a05c1 ("hv_netvsc: change GPAD teardown order on older versions") Signed-off-by: Mohammed Gamal mgamal@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/hyperv/netvsc.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-)
--- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -571,8 +571,17 @@ void netvsc_device_remove(struct hv_devi = rtnl_dereference(net_device_ctx->nvdev); int i;
+ /* + * Revoke receive buffer. If host is pre-Win2016 then tear down + * receive buffer GPADL. Do the same for send buffer. + */ netvsc_revoke_recv_buf(device, net_device); + if (vmbus_proto_version < VERSION_WIN10) + netvsc_teardown_recv_gpadl(device, net_device); + netvsc_revoke_send_buf(device, net_device); + if (vmbus_proto_version < VERSION_WIN10) + netvsc_teardown_send_gpadl(device, net_device);
RCU_INIT_POINTER(net_device_ctx->nvdev, NULL);
@@ -586,15 +595,13 @@ void netvsc_device_remove(struct hv_devi */ netdev_dbg(ndev, "net device safe to remove\n");
- /* older versions require that buffer be revoked before close */ - if (vmbus_proto_version < VERSION_WIN10) { - netvsc_teardown_recv_gpadl(device, net_device); - netvsc_teardown_send_gpadl(device, net_device); - } - /* Now, we can close the channel safely */ vmbus_close(device->channel);
+ /* + * If host is Win2016 or higher then we do the GPADL tear down + * here after VMBus is closed. + */ if (vmbus_proto_version >= VERSION_WIN10) { netvsc_teardown_recv_gpadl(device, net_device); netvsc_teardown_send_gpadl(device, net_device);