mana_get_rxbuf_cfg() aligns the RX buffer's DMA datasize to be multiple of 64. So a packet slightly bigger than mtu+14, say 1536, can be received and cause skb_over_panic.
Sample dmesg: [ 5325.237162] skbuff: skb_over_panic: text:ffffffffc043277a len:1536 put:1536 head:ff1100018b517000 data:ff1100018b517100 tail:0x700 end:0x6ea dev:<NULL> [ 5325.243689] ------------[ cut here ]------------ [ 5325.245748] kernel BUG at net/core/skbuff.c:192! [ 5325.247838] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 5325.258374] RIP: 0010:skb_panic+0x4f/0x60 [ 5325.302941] Call Trace: [ 5325.304389] <IRQ> [ 5325.315794] ? skb_panic+0x4f/0x60 [ 5325.317457] ? asm_exc_invalid_op+0x1f/0x30 [ 5325.319490] ? skb_panic+0x4f/0x60 [ 5325.321161] skb_put+0x4e/0x50 [ 5325.322670] mana_poll+0x6fa/0xb50 [mana] [ 5325.324578] __napi_poll+0x33/0x1e0 [ 5325.326328] net_rx_action+0x12e/0x280
As discussed internally, this alignment is not necessary. To fix this bug, remove it from the code. So oversized packets will be marked as CQE_RX_TRUNCATED by NIC, and dropped.
Cc: stable@vger.kernel.org Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)") Signed-off-by: Haiyang Zhang haiyangz@microsoft.com --- drivers/net/ethernet/microsoft/mana/mana_en.c | 2 +- include/net/mana/mana.h | 1 - 2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c index 59287c6e6cee..d8af5e7e15b4 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_en.c +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c @@ -601,7 +601,7 @@ static void mana_get_rxbuf_cfg(int mtu, u32 *datasize, u32 *alloc_size,
*alloc_size = mtu + MANA_RXBUF_PAD + *headroom;
- *datasize = ALIGN(mtu + ETH_HLEN, MANA_RX_DATA_ALIGN); + *datasize = mtu + ETH_HLEN; }
static int mana_pre_alloc_rxbufs(struct mana_port_context *mpc, int new_mtu) diff --git a/include/net/mana/mana.h b/include/net/mana/mana.h index 76147feb0d10..4eeedf14711b 100644 --- a/include/net/mana/mana.h +++ b/include/net/mana/mana.h @@ -39,7 +39,6 @@ enum TRI_STATE { #define COMP_ENTRY_SIZE 64
#define RX_BUFFERS_PER_QUEUE 512 -#define MANA_RX_DATA_ALIGN 64
#define MAX_SEND_BUFFERS_PER_QUEUE 256
From: LKML haiyangz lkmlhyz@microsoft.com On Behalf Of Haiyang Zhang Sent: Friday, March 29, 2024 2:37 PM [...] mana_get_rxbuf_cfg() aligns the RX buffer's DMA datasize to be multiple of 64. So a packet slightly bigger than mtu+14, say 1536, can be received and cause skb_over_panic.
Sample dmesg: [ 5325.237162] skbuff: skb_over_panic: text:ffffffffc043277a len:1536 put:1536 head:ff1100018b517000 data:ff1100018b517100 tail:0x700 end:0x6ea dev:<NULL> [ 5325.243689] ------------[ cut here ]------------ [ 5325.245748] kernel BUG at net/core/skbuff.c:192! [ 5325.247838] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 5325.258374] RIP: 0010:skb_panic+0x4f/0x60 [ 5325.302941] Call Trace: [ 5325.304389] <IRQ> [ 5325.315794] ? skb_panic+0x4f/0x60 [ 5325.317457] ? asm_exc_invalid_op+0x1f/0x30 [ 5325.319490] ? skb_panic+0x4f/0x60 [ 5325.321161] skb_put+0x4e/0x50 [ 5325.322670] mana_poll+0x6fa/0xb50 [mana] [ 5325.324578] __napi_poll+0x33/0x1e0 [ 5325.326328] net_rx_action+0x12e/0x280
As discussed internally, this alignment is not necessary. To fix this bug, remove it from the code. So oversized packets will be marked as CQE_RX_TRUNCATED by NIC, and dropped.
Cc: stable@vger.kernel.org Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
While the unnecessary alignment indeed first appeared in ca9c54d2d6a5 (Apr 2021), ca9c54d2d6a5 didn't cause any crash because the only supported MTU size was 1500, and the RX buffer was a page (which is big enough to hold an incoming packet of a size up to 1536 bytes), and mana_rx_skb() created a big skb of 1 page (which is big enough to hold the incoming packet); the only issue with ca9c54d2d6a5 was that an incoming packet of 1515~1536 bytes (if such a packet was ever received by the NIC) was still properly delivered to the upper layer network stack, while ideally such a packet should be dropped by the NIC driver as we would have received CQE_RX_TRUNCATED if ca9c54d2d6a5 hadn't done the unnecessary alignment.
So, my understanding is that ca9c54d2d6a5 is imperfect but it doesn't really cause any real issue.
It looks like the real issue started in the commit (Apr 12 14:16:02 2023) 2fbbd712baf1 ("net: mana: Enable RX path to handle various MTU sizes") which introduces "rxq->alloc_size", which I think can be smaller than rxq->datasize, and is used in mana_poll() --> ... -> mana_build_skb(), which I think causes the crash because the size of the allocated skb may not be able to hold a big incoming packet.
Later, the commit (Apr 12 14:16:03 2023) 80f6215b450e ("net: mana: Add support for jumbo frame") does code refactoring and introduces mana_get_rxbuf_cfg().
I suggest the Fixes tag should be updated to: Fixes: 2fbbd712baf1 ("net: mana: Enable RX path to handle various MTU sizes") , because if we used "Fixes: ca9c54d2d6a5", the distro vendors would ask us: does this fix need to be backported to old kernels that don't have 2fbbd712baf1? The fix can't apply cleanly there and is not really needed there. The stable tree maintainers would ask the same question.
drivers/net/ethernet/microsoft/mana/mana_en.c | 2 +- include/net/mana/mana.h | 1 - 2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c index 59287c6e6cee..d8af5e7e15b4 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_en.c +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c @@ -601,7 +601,7 @@ static void mana_get_rxbuf_cfg(int mtu, u32 *datasize, u32 *alloc_size,
*alloc_size = mtu + MANA_RXBUF_PAD + *headroom;
- *datasize = ALIGN(mtu + ETH_HLEN, MANA_RX_DATA_ALIGN);
- *datasize = mtu + ETH_HLEN;
}
I suggest the Fixes tag should be updated. Otherwise the fix looks good to me.
Reviewed-by: Dexuan Cui decui@microsoft.com
-----Original Message----- From: Dexuan Cui decui@microsoft.com Sent: Friday, March 29, 2024 8:12 PM To: Haiyang Zhang haiyangz@microsoft.com; linux-hyperv@vger.kernel.org; netdev@vger.kernel.org Cc: stephen stephen@networkplumber.org; KY Srinivasan kys@microsoft.com; Paul Rosswurm paulros@microsoft.com; olaf@aepfle.de; vkuznets@redhat.com; davem@davemloft.net; wei.liu@kernel.org; edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; leon@kernel.org; Long Li longli@microsoft.com; ssengar@linux.microsoft.com; linux-rdma@vger.kernel.org; daniel@iogearbox.net; john.fastabend@gmail.com; bpf@vger.kernel.org; ast@kernel.org; sharmaajay@microsoft.com; hawk@kernel.org; tglx@linutronix.de; shradhagupta@linux.microsoft.com; linux- kernel@vger.kernel.org; stable@vger.kernel.org Subject: RE: [PATCH net] net: mana: Fix Rx DMA datasize and skb_over_panic
From: LKML haiyangz lkmlhyz@microsoft.com On Behalf Of Haiyang Zhang Sent: Friday, March 29, 2024 2:37 PM [...] mana_get_rxbuf_cfg() aligns the RX buffer's DMA datasize to be multiple of 64. So a packet slightly bigger than mtu+14, say 1536, can be received and cause skb_over_panic.
Sample dmesg: [ 5325.237162] skbuff: skb_over_panic: text:ffffffffc043277a len:1536 put:1536 head:ff1100018b517000 data:ff1100018b517100 tail:0x700 end:0x6ea dev:<NULL> [ 5325.243689] ------------[ cut here ]------------ [ 5325.245748] kernel BUG at net/core/skbuff.c:192! [ 5325.247838] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 5325.258374] RIP: 0010:skb_panic+0x4f/0x60 [ 5325.302941] Call Trace: [ 5325.304389] <IRQ> [ 5325.315794] ? skb_panic+0x4f/0x60 [ 5325.317457] ? asm_exc_invalid_op+0x1f/0x30 [ 5325.319490] ? skb_panic+0x4f/0x60 [ 5325.321161] skb_put+0x4e/0x50 [ 5325.322670] mana_poll+0x6fa/0xb50 [mana] [ 5325.324578] __napi_poll+0x33/0x1e0 [ 5325.326328] net_rx_action+0x12e/0x280
As discussed internally, this alignment is not necessary. To fix this bug, remove it from the code. So oversized packets will be marked as CQE_RX_TRUNCATED by NIC, and dropped.
Cc: stable@vger.kernel.org Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure
Network
Adapter (MANA)")
While the unnecessary alignment indeed first appeared in ca9c54d2d6a5 (Apr 2021), ca9c54d2d6a5 didn't cause any crash because the only supported MTU size was 1500, and the RX buffer was a page (which is big enough to hold an incoming packet of a size up to 1536 bytes), and mana_rx_skb() created a big skb of 1 page (which is big enough to hold the incoming packet); the only issue with ca9c54d2d6a5 was that an incoming packet of 1515~1536 bytes (if such a packet was ever received by the NIC) was still properly delivered to the upper layer network stack, while ideally such a packet should be dropped by the NIC driver as we would have received CQE_RX_TRUNCATED if ca9c54d2d6a5 hadn't done the unnecessary alignment.
So, my understanding is that ca9c54d2d6a5 is imperfect but it doesn't really cause any real issue.
It looks like the real issue started in the commit (Apr 12 14:16:02 2023) 2fbbd712baf1 ("net: mana: Enable RX path to handle various MTU sizes") which introduces "rxq->alloc_size", which I think can be smaller than rxq->datasize, and is used in mana_poll() --> ... -> mana_build_skb(), which I think causes the crash because the size of the allocated skb may not be able to hold a big incoming packet.
Later, the commit (Apr 12 14:16:03 2023) 80f6215b450e ("net: mana: Add support for jumbo frame") does code refactoring and introduces mana_get_rxbuf_cfg().
I suggest the Fixes tag should be updated to: Fixes: 2fbbd712baf1 ("net: mana: Enable RX path to handle various MTU sizes") , because if we used "Fixes: ca9c54d2d6a5", the distro vendors would ask us: does this fix need to be backported to old kernels that don't have 2fbbd712baf1? The fix can't apply cleanly there and is not really needed there. The stable tree maintainers would ask the same question.
drivers/net/ethernet/microsoft/mana/mana_en.c | 2 +- include/net/mana/mana.h | 1 - 2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c index 59287c6e6cee..d8af5e7e15b4 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_en.c +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c @@ -601,7 +601,7 @@ static void mana_get_rxbuf_cfg(int mtu, u32 *datasize, u32 *alloc_size,
*alloc_size = mtu + MANA_RXBUF_PAD + *headroom;
- *datasize = ALIGN(mtu + ETH_HLEN, MANA_RX_DATA_ALIGN);
- *datasize = mtu + ETH_HLEN;
}
I suggest the Fixes tag should be updated. Otherwise the fix looks good to me.
Thanks for the suggestion. I actually thought about this before submission. I was worried about someone back ports the jumbo frame feature, they may not automatically know this patch should be backported too. Also, I suspect that a bigger than MTU packet may cause unexpected problem at NVA application.
If anyone have questions on back porting, I can provide a back ported patch, which is just one line change.
- Haiyang
From: Haiyang Zhang haiyangz@microsoft.com Sent: Monday, April 1, 2024 4:21 PM
[...] I suggest the Fixes tag should be updated. Otherwise the fix looks good to me.
Thanks for the suggestion. I actually thought about this before submission. I was worried about someone back ports the jumbo frame feature, they may not automatically know this patch should be backported too.
The jumbo frame commit (2fbbd712baf1) depends on the MTU commit (2fbbd712baf1), so adding "Fixes: 2fbbd712baf1" ( instead of "Fixes: ca9c54d2d6a5") might make it easier for people to notice and pick up this fix.
I'm OK if the patch remains as is. Just wanted to make sure I understand the issue here.
Also, I suspect that a bigger than MTU packet may cause unexpected problem at NVA application.
Good point. + Wei Hu, who can check the FreeBSD MANA driver and the DPDK MANA PMD. The code there may also need to be fixed.
If anyone have questions on back porting, I can provide a back ported patch, which is just one line change.
- Haiyang
If the patch remains as is, gregkg will send us "failed to apply" emails for v6.1.y and v5.15.y and we'll need to make a backport for the 2 stable kernels.
With "Fixes: 2fbbd712baf1", we won't receive the "failed to apply" emails.
Thanks, Dexuan
On Tue, 2 Apr 2024 01:23:08 +0000 Dexuan Cui wrote:
I suggest the Fixes tag should be updated. Otherwise the fix looks good to me.
Thanks for the suggestion. I actually thought about this before submission. I was worried about someone back ports the jumbo frame feature, they may not automatically know this patch should be backported too.
The jumbo frame commit (2fbbd712baf1) depends on the MTU commit (2fbbd712baf1), so adding "Fixes: 2fbbd712baf1" ( instead of "Fixes: ca9c54d2d6a5") might make it easier for people to notice and pick up this fix.
I'm OK if the patch remains as is. Just wanted to make sure I understand the issue here.
Please update the tag to where the bug was actually first exposed.
-----Original Message----- From: Jakub Kicinski kuba@kernel.org Sent: Tuesday, April 2, 2024 12:13 AM To: Dexuan Cui decui@microsoft.com Cc: Haiyang Zhang haiyangz@microsoft.com; linux-hyperv@vger.kernel.org; netdev@vger.kernel.org; Wei Hu weh@microsoft.com; stephen stephen@networkplumber.org; KY Srinivasan kys@microsoft.com; Paul Rosswurm paulros@microsoft.com; olaf@aepfle.de; vkuznets@redhat.com; davem@davemloft.net; wei.liu@kernel.org; edumazet@google.com; pabeni@redhat.com; leon@kernel.org; Long Li longli@microsoft.com; ssengar@linux.microsoft.com; linux-rdma@vger.kernel.org; daniel@iogearbox.net; john.fastabend@gmail.com; bpf@vger.kernel.org; ast@kernel.org; sharmaajay@microsoft.com; hawk@kernel.org; tglx@linutronix.de; shradhagupta@linux.microsoft.com; linux- kernel@vger.kernel.org; stable@vger.kernel.org Subject: Re: [PATCH net] net: mana: Fix Rx DMA datasize and skb_over_panic
On Tue, 2 Apr 2024 01:23:08 +0000 Dexuan Cui wrote:
I suggest the Fixes tag should be updated. Otherwise the fix looks good to me.
Thanks for the suggestion. I actually thought about this before submission. I was worried about someone back ports the jumbo frame feature, they may not automatically know this patch should be backported too.
The jumbo frame commit (2fbbd712baf1) depends on the MTU commit (2fbbd712baf1), so adding "Fixes: 2fbbd712baf1" ( instead of "Fixes: ca9c54d2d6a5") might make it easier for people to notice and pick up this fix.
I'm OK if the patch remains as is. Just wanted to make sure I understand the issue here.
Please update the tag to where the bug was actually first exposed.
Thank Dexuan and Jakub for the suggestions. I will update the tag.
- Haiyang
linux-stable-mirror@lists.linaro.org