netvsc_dma_map() and netvsc_dma_unmap() currently check the cp_partial flag and adjust the page_count so that pagebuf entries for the RNDIS portion of the message are skipped when it has already been copied into a send buffer. But this adjustment has already been made by code in netvsc_send(). The duplicate adjustment causes some pagebuf entries to not be mapped. In a normal VM, this doesn't break anything because the mapping doesn’t change the PFN. But in a Confidential VM, dma_map_single() does bounce buffering and provides a different PFN. Failing to do the mapping causes the wrong PFN to be passed to Hyper-V, and various errors ensue.
Fix this by removing the duplicate adjustment in netvsc_dma_map() and netvsc_dma_unmap().
Fixes: 846da38de0e8 ("net: netvsc: Add Isolation VM support for netvsc driver") Cc: stable@vger.kernel.org Signed-off-by: Michael Kelley mikelley@microsoft.com --- drivers/net/hyperv/netvsc.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index 661bbe6..8acfa40 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -949,9 +949,6 @@ static void netvsc_copy_to_send_buf(struct netvsc_device *net_device, void netvsc_dma_unmap(struct hv_device *hv_dev, struct hv_netvsc_packet *packet) { - u32 page_count = packet->cp_partial ? - packet->page_buf_cnt - packet->rmsg_pgcnt : - packet->page_buf_cnt; int i;
if (!hv_is_isolation_supported()) @@ -960,7 +957,7 @@ void netvsc_dma_unmap(struct hv_device *hv_dev, if (!packet->dma_range) return;
- for (i = 0; i < page_count; i++) + for (i = 0; i < packet->page_buf_cnt; i++) dma_unmap_single(&hv_dev->device, packet->dma_range[i].dma, packet->dma_range[i].mapping_size, DMA_TO_DEVICE); @@ -990,9 +987,7 @@ static int netvsc_dma_map(struct hv_device *hv_dev, struct hv_netvsc_packet *packet, struct hv_page_buffer *pb) { - u32 page_count = packet->cp_partial ? - packet->page_buf_cnt - packet->rmsg_pgcnt : - packet->page_buf_cnt; + u32 page_count = packet->page_buf_cnt; dma_addr_t dma; int i;
-----Original Message----- From: Michael Kelley (LINUX) mikelley@microsoft.com Sent: Monday, January 30, 2023 10:33 PM To: KY Srinivasan kys@microsoft.com; Haiyang Zhang haiyangz@microsoft.com; wei.liu@kernel.org; Dexuan Cui decui@microsoft.com; davem@davemloft.net; edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; netdev@vger.kernel.org; linux- hyperv@vger.kernel.org; linux-kernel@vger.kernel.org Cc: Michael Kelley (LINUX) mikelley@microsoft.com; stable@vger.kernel.org Subject: [PATCH net 1/1] hv_netvsc: Fix missed pagebuf entries in netvsc_dma_map/unmap()
netvsc_dma_map() and netvsc_dma_unmap() currently check the cp_partial flag and adjust the page_count so that pagebuf entries for the RNDIS portion of the message are skipped when it has already been copied into a send buffer. But this adjustment has already been made by code in netvsc_send(). The duplicate adjustment causes some pagebuf entries to not be mapped. In a normal VM, this doesn't break anything because the mapping doesn’t change the PFN. But in a Confidential VM, dma_map_single() does bounce buffering and provides a different PFN. Failing to do the mapping causes the wrong PFN to be passed to Hyper-V, and various errors ensue.
Fix this by removing the duplicate adjustment in netvsc_dma_map() and netvsc_dma_unmap().
Fixes: 846da38de0e8 ("net: netvsc: Add Isolation VM support for netvsc driver") Cc: stable@vger.kernel.org Signed-off-by: Michael Kelley mikelley@microsoft.com
Reviewed-by: Haiyang Zhang haiyangz@microsoft.com
Thanks!
On Mon, 30 Jan 2023 19:33:06 -0800 Michael Kelley wrote:
@@ -990,9 +987,7 @@ static int netvsc_dma_map(struct hv_device *hv_dev, struct hv_netvsc_packet *packet, struct hv_page_buffer *pb) {
- u32 page_count = packet->cp_partial ?
packet->page_buf_cnt - packet->rmsg_pgcnt :
packet->page_buf_cnt;
- u32 page_count = packet->page_buf_cnt; dma_addr_t dma; int i;
Suspiciously, the caller still does:
if (packet->cp_partial) pb += packet->rmsg_pgcnt;
ret = netvsc_dma_map(ndev_ctx->device_ctx, packet, pb);
Shouldn't that if () pb +=... also go away?
From: Jakub Kicinski kuba@kernel.org Sent: Wednesday, February 1, 2023 9:01 PM
On Mon, 30 Jan 2023 19:33:06 -0800 Michael Kelley wrote:
@@ -990,9 +987,7 @@ static int netvsc_dma_map(struct hv_device *hv_dev, struct hv_netvsc_packet *packet, struct hv_page_buffer *pb) {
- u32 page_count = packet->cp_partial ?
packet->page_buf_cnt - packet->rmsg_pgcnt :
packet->page_buf_cnt;
- u32 page_count = packet->page_buf_cnt; dma_addr_t dma; int i;
Suspiciously, the caller still does:
if (packet->cp_partial) pb += packet->rmsg_pgcnt; ret = netvsc_dma_map(ndev_ctx->device_ctx, packet, pb);
Shouldn't that if () pb +=... also go away?
No -- it's correct.
In netvsc_send(), cp_partial is tested and packet->page_buf_cnt is adjusted. But the pointer into the pagebuf array is not adjusted in netvsc_send(). Instead it is adjusted here in netvsc_send_pkt(), which brings it back in sync with packet->page_buf_cnt.
I don't know if there's a good reason for the adjustment being split across two different functions. It doesn't seem like the most straightforward approach. From a quick glance at the code it looks like this adjustment to 'pb' could move to netvsc_send() to be together with the adjustment to packet->page_buf_cnt, but maybe there's a reason for the split that I'm not familiar with.
Haiyang -- any insight?
Michael
On Thu, 2023-02-02 at 05:20 +0000, Michael Kelley (LINUX) wrote:
From: Jakub Kicinski kuba@kernel.org Sent: Wednesday, February 1, 2023 9:01 PM
On Mon, 30 Jan 2023 19:33:06 -0800 Michael Kelley wrote:
@@ -990,9 +987,7 @@ static int netvsc_dma_map(struct hv_device *hv_dev, struct hv_netvsc_packet *packet, struct hv_page_buffer *pb) {
- u32 page_count = packet->cp_partial ?
packet->page_buf_cnt - packet->rmsg_pgcnt :
packet->page_buf_cnt;
- u32 page_count = packet->page_buf_cnt; dma_addr_t dma; int i;
Suspiciously, the caller still does:
if (packet->cp_partial) pb += packet->rmsg_pgcnt; ret = netvsc_dma_map(ndev_ctx->device_ctx, packet, pb);
Shouldn't that if () pb +=... also go away?
No -- it's correct.
In netvsc_send(), cp_partial is tested and packet->page_buf_cnt is adjusted. But the pointer into the pagebuf array is not adjusted in netvsc_send(). Instead it is adjusted here in netvsc_send_pkt(), which brings it back in sync with packet->page_buf_cnt.
Ok
I don't know if there's a good reason for the adjustment being split across two different functions. It doesn't seem like the most straightforward approach. From a quick glance at the code it looks like this adjustment to 'pb' could move to netvsc_send() to be together with the adjustment to packet->page_buf_cnt, but maybe there's a reason for the split that I'm not familiar with.
Haiyang -- any insight?
While at that, please also have a look at the following allocation in netvsc_dma_map():
packet->dma_range = kcalloc(page_count, sizeof(*packet->dma_range), GFP_KERNEL);
which looks wrong - netvsc_dma_map() should be in atomic context.
Anyway it's a topic unrelated from this patch. I just stumbled upon it while reviewing.
Cheers,
Paolo
From: Paolo Abeni pabeni@redhat.com Sent: Thursday, February 2, 2023 12:31 AM
On Thu, 2023-02-02 at 05:20 +0000, Michael Kelley (LINUX) wrote:
From: Jakub Kicinski kuba@kernel.org Sent: Wednesday, February 1, 2023 9:01 PM
On Mon, 30 Jan 2023 19:33:06 -0800 Michael Kelley wrote:
@@ -990,9 +987,7 @@ static int netvsc_dma_map(struct hv_device *hv_dev, struct hv_netvsc_packet *packet, struct hv_page_buffer *pb) {
- u32 page_count = packet->cp_partial ?
packet->page_buf_cnt - packet->rmsg_pgcnt :
packet->page_buf_cnt;
- u32 page_count = packet->page_buf_cnt; dma_addr_t dma; int i;
Suspiciously, the caller still does:
if (packet->cp_partial) pb += packet->rmsg_pgcnt; ret = netvsc_dma_map(ndev_ctx->device_ctx, packet, pb);
Shouldn't that if () pb +=... also go away?
No -- it's correct.
In netvsc_send(), cp_partial is tested and packet->page_buf_cnt is adjusted. But the pointer into the pagebuf array is not adjusted in netvsc_send(). Instead it is adjusted here in netvsc_send_pkt(), which brings it back in sync with packet->page_buf_cnt.
Ok
I don't know if there's a good reason for the adjustment being split across two different functions. It doesn't seem like the most straightforward approach. From a quick glance at the code it looks like this adjustment to 'pb' could move to netvsc_send() to be together with the adjustment to packet->page_buf_cnt, but maybe there's a reason for the split that I'm not familiar with.
Haiyang -- any insight?
While at that, please also have a look at the following allocation in netvsc_dma_map():
packet->dma_range = kcalloc(page_count, sizeof(*packet->dma_range), GFP_KERNEL);
which looks wrong - netvsc_dma_map() should be in atomic context.
Anyway it's a topic unrelated from this patch. I just stumbled upon it while reviewing.
Thanks for pointing this out. I've made a note to do a fix.
Michael
Hello:
This patch was applied to netdev/net.git (master) by Paolo Abeni pabeni@redhat.com:
On Mon, 30 Jan 2023 19:33:06 -0800 you wrote:
netvsc_dma_map() and netvsc_dma_unmap() currently check the cp_partial flag and adjust the page_count so that pagebuf entries for the RNDIS portion of the message are skipped when it has already been copied into a send buffer. But this adjustment has already been made by code in netvsc_send(). The duplicate adjustment causes some pagebuf entries to not be mapped. In a normal VM, this doesn't break anything because the mapping doesn’t change the PFN. But in a Confidential VM, dma_map_single() does bounce buffering and provides a different PFN. Failing to do the mapping causes the wrong PFN to be passed to Hyper-V, and various errors ensue.
[...]
Here is the summary with links: - [net,1/1] hv_netvsc: Fix missed pagebuf entries in netvsc_dma_map/unmap() https://git.kernel.org/netdev/net/c/99f1c46011cc
You are awesome, thank you!
linux-stable-mirror@lists.linaro.org