In virtio-net, we do not yet support multi-buffer XDP packets in zerocopy mode when an XDP program is bound. However, in that case, when receiving a multi-buffer XDP packet, we skip the XDP program and return XDP_PASS. As a result, the packet is passed to the normal network stack, which is incorrect behavior. This commit instead returns XDP_DROP in that case.
Fixes: 99c861b44eb1 ("virtio_net: xsk: rx: support recv merge mode")
Cc: stable@vger.kernel.org
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>
---
 drivers/net/virtio_net.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index e53ba600605a..4c35324d6e5b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1309,9 +1309,14 @@ static struct sk_buff *virtnet_receive_xsk_merge(struct net_device *dev, struct
         ret = XDP_PASS;
         rcu_read_lock();
         prog = rcu_dereference(rq->xdp_prog);
-        /* TODO: support multi buffer. */
-        if (prog && num_buf == 1)
-                ret = virtnet_xdp_handler(prog, xdp, dev, xdp_xmit, stats);
+        if (prog) {
+                /* TODO: support multi buffer. */
+                if (num_buf == 1)
+                        ret = virtnet_xdp_handler(prog, xdp, dev, xdp_xmit,
+                                                  stats);
+                else
+                        ret = XDP_DROP;
+        }
         rcu_read_unlock();

         switch (ret) {
On Tue, Jun 3, 2025 at 11:07 PM Bui Quang Minh minhquangbui99@gmail.com wrote:
In virtio-net, we do not yet support multi-buffer XDP packets in zerocopy mode when an XDP program is bound. However, in that case, when receiving a multi-buffer XDP packet, we skip the XDP program and return XDP_PASS. As a result, the packet is passed to the normal network stack, which is incorrect behavior. This commit instead returns XDP_DROP in that case.

Fixes: 99c861b44eb1 ("virtio_net: xsk: rx: support recv merge mode")
Cc: stable@vger.kernel.org
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>

 drivers/net/virtio_net.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index e53ba600605a..4c35324d6e5b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1309,9 +1309,14 @@ static struct sk_buff *virtnet_receive_xsk_merge(struct net_device *dev, struct
         ret = XDP_PASS;
It would be simpler to just assign XDP_DROP here?
Or if you wish to stick with this approach, we can simply remove this assignment.
         rcu_read_lock();
         prog = rcu_dereference(rq->xdp_prog);
-        /* TODO: support multi buffer. */
-        if (prog && num_buf == 1)
-                ret = virtnet_xdp_handler(prog, xdp, dev, xdp_xmit, stats);
+        if (prog) {
+                /* TODO: support multi buffer. */
+                if (num_buf == 1)
+                        ret = virtnet_xdp_handler(prog, xdp, dev, xdp_xmit,
+                                                  stats);
+                else
+                        ret = XDP_DROP;
+        }
         rcu_read_unlock();

         switch (ret) {
-- 2.43.0
Thanks
On 6/4/25 07:37, Jason Wang wrote:
On Tue, Jun 3, 2025 at 11:07 PM Bui Quang Minh minhquangbui99@gmail.com wrote:
In virtio-net, we do not yet support multi-buffer XDP packets in zerocopy mode when an XDP program is bound. However, in that case, when receiving a multi-buffer XDP packet, we skip the XDP program and return XDP_PASS. As a result, the packet is passed to the normal network stack, which is incorrect behavior. This commit instead returns XDP_DROP in that case.

Fixes: 99c861b44eb1 ("virtio_net: xsk: rx: support recv merge mode")
Cc: stable@vger.kernel.org
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>

 drivers/net/virtio_net.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index e53ba600605a..4c35324d6e5b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1309,9 +1309,14 @@ static struct sk_buff *virtnet_receive_xsk_merge(struct net_device *dev, struct
         ret = XDP_PASS;
It would be simpler to just assign XDP_DROP here?
Or if you wish to stick with this approach, we can simply remove this assignment.
This XDP_PASS is returned for the case when no XDP program is bound (!prog).
         rcu_read_lock();
         prog = rcu_dereference(rq->xdp_prog);
-        /* TODO: support multi buffer. */
-        if (prog && num_buf == 1)
-                ret = virtnet_xdp_handler(prog, xdp, dev, xdp_xmit, stats);
+        if (prog) {
+                /* TODO: support multi buffer. */
+                if (num_buf == 1)
+                        ret = virtnet_xdp_handler(prog, xdp, dev, xdp_xmit,
+                                                  stats);
+                else
+                        ret = XDP_DROP;
+        }
         rcu_read_unlock();

         switch (ret) {
-- 2.43.0
Thanks
Thanks, Quang Minh.
On Wed, Jun 4, 2025 at 10:17 PM Bui Quang Minh minhquangbui99@gmail.com wrote:
On 6/4/25 07:37, Jason Wang wrote:
On Tue, Jun 3, 2025 at 11:07 PM Bui Quang Minh minhquangbui99@gmail.com wrote:
In virtio-net, we do not yet support multi-buffer XDP packets in zerocopy mode when an XDP program is bound. However, in that case, when receiving a multi-buffer XDP packet, we skip the XDP program and return XDP_PASS. As a result, the packet is passed to the normal network stack, which is incorrect behavior. This commit instead returns XDP_DROP in that case.

Fixes: 99c861b44eb1 ("virtio_net: xsk: rx: support recv merge mode")
Cc: stable@vger.kernel.org
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>

 drivers/net/virtio_net.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index e53ba600605a..4c35324d6e5b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1309,9 +1309,14 @@ static struct sk_buff *virtnet_receive_xsk_merge(struct net_device *dev, struct
         ret = XDP_PASS;
It would be simpler to just assign XDP_DROP here?
Or if you wish to stick with this approach, we can simply remove this assignment.
This XDP_PASS is returned for the case when no XDP program is bound (!prog).
You're right.
Acked-by: Jason Wang <jasowang@redhat.com>
Thanks
         rcu_read_lock();
         prog = rcu_dereference(rq->xdp_prog);
-        /* TODO: support multi buffer. */
-        if (prog && num_buf == 1)
-                ret = virtnet_xdp_handler(prog, xdp, dev, xdp_xmit, stats);
+        if (prog) {
+                /* TODO: support multi buffer. */
+                if (num_buf == 1)
+                        ret = virtnet_xdp_handler(prog, xdp, dev, xdp_xmit,
+                                                  stats);
+                else
+                        ret = XDP_DROP;
+        }
         rcu_read_unlock();

         switch (ret) {
-- 2.43.0
Thanks
Thanks, Quang Minh.
On Tue, Jun 3, 2025 at 8:09 AM Bui Quang Minh minhquangbui99@gmail.com wrote:
In virtio-net, we do not yet support multi-buffer XDP packets in zerocopy mode when an XDP program is bound. However, in that case, when receiving a multi-buffer XDP packet, we skip the XDP program and return XDP_PASS. As a result, the packet is passed to the normal network stack, which is incorrect behavior. This commit instead returns XDP_DROP in that case.
Does it make more sense to return XDP_ABORTED? This seems like an unexpected exception case to me, but I'm not familiar enough with virtio-net's multibuffer support.
Fixes: 99c861b44eb1 ("virtio_net: xsk: rx: support recv merge mode")
Cc: stable@vger.kernel.org
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>

 drivers/net/virtio_net.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index e53ba600605a..4c35324d6e5b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1309,9 +1309,14 @@ static struct sk_buff *virtnet_receive_xsk_merge(struct net_device *dev, struct
         ret = XDP_PASS;
         rcu_read_lock();
         prog = rcu_dereference(rq->xdp_prog);
-        /* TODO: support multi buffer. */
-        if (prog && num_buf == 1)
-                ret = virtnet_xdp_handler(prog, xdp, dev, xdp_xmit, stats);
+        if (prog) {
+                /* TODO: support multi buffer. */
+                if (num_buf == 1)
+                        ret = virtnet_xdp_handler(prog, xdp, dev, xdp_xmit,
+                                                  stats);
+                else
+                        ret = XDP_DROP;
+        }
         rcu_read_unlock();

         switch (ret) {
2.43.0
On 6/4/25 23:55, Zvi Effron wrote:
On Tue, Jun 3, 2025 at 8:09 AM Bui Quang Minh minhquangbui99@gmail.com wrote:
In virtio-net, we do not yet support multi-buffer XDP packets in zerocopy mode when an XDP program is bound. However, in that case, when receiving a multi-buffer XDP packet, we skip the XDP program and return XDP_PASS. As a result, the packet is passed to the normal network stack, which is incorrect behavior. This commit instead returns XDP_DROP in that case.
Does it make more sense to return XDP_ABORTED? This seems like an unexpected exception case to me, but I'm not familiar enough with virtio-net's multibuffer support.
The code that follows this treats XDP_DROP and XDP_ABORTED in the same way. I don't have a strong opinion between these two values here. We may add a call to trace_xdp_exception in case we want XDP_ABORTED here.
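If we did go with XDP_ABORTED, a minimal sketch of that branch might look like the below (untested, just to show where the trace_xdp_exception() call would go, on top of this patch):

        if (prog) {
                /* TODO: support multi buffer. */
                if (num_buf == 1) {
                        ret = virtnet_xdp_handler(prog, xdp, dev, xdp_xmit,
                                                  stats);
                } else {
                        /* hypothetical: abort and trace instead of a silent drop */
                        ret = XDP_ABORTED;
                        trace_xdp_exception(dev, prog, ret);
                }
        }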
Thanks, Quang Minh.
Fixes: 99c861b44eb1 ("virtio_net: xsk: rx: support recv merge mode")
Cc: stable@vger.kernel.org
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>

 drivers/net/virtio_net.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index e53ba600605a..4c35324d6e5b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1309,9 +1309,14 @@ static struct sk_buff *virtnet_receive_xsk_merge(struct net_device *dev, struct
         ret = XDP_PASS;
         rcu_read_lock();
         prog = rcu_dereference(rq->xdp_prog);
-        /* TODO: support multi buffer. */
-        if (prog && num_buf == 1)
-                ret = virtnet_xdp_handler(prog, xdp, dev, xdp_xmit, stats);
+        if (prog) {
+                /* TODO: support multi buffer. */
+                if (num_buf == 1)
+                        ret = virtnet_xdp_handler(prog, xdp, dev, xdp_xmit,
+                                                  stats);
+                else
+                        ret = XDP_DROP;
+        }
         rcu_read_unlock();

         switch (ret) {
2.43.0
On 6/3/25 5:06 PM, Bui Quang Minh wrote:
In virtio-net, we do not yet support multi-buffer XDP packets in zerocopy mode when an XDP program is bound. However, in that case, when receiving a multi-buffer XDP packet, we skip the XDP program and return XDP_PASS. As a result, the packet is passed to the normal network stack, which is incorrect behavior.
Why? AFAICS the multi-buffer mode depends on features negotiation, which is not controlled by the VM user.
Let's suppose the user wants to attach an XDP program to do some per-packet stats accounting. That would suddenly cause packets to be dropped depending on conditions not controlled by the (guest) user. It looks wrong to me.
XDP_ABORTED looks like a better choice.
/P
On 6/5/25 18:03, Paolo Abeni wrote:
On 6/3/25 5:06 PM, Bui Quang Minh wrote:
In virtio-net, we do not yet support multi-buffer XDP packets in zerocopy mode when an XDP program is bound. However, in that case, when receiving a multi-buffer XDP packet, we skip the XDP program and return XDP_PASS. As a result, the packet is passed to the normal network stack, which is incorrect behavior.
Why? AFAICS the multi-buffer mode depends on features negotiation, which is not controlled by the VM user.
Let's suppose the user wants to attach an XDP program to do some per-packet stats accounting. That would suddenly cause packets to be dropped depending on conditions not controlled by the (guest) user. It looks wrong to me.
But currently, if a multi-buffer packet arrives, it will not go through the XDP program, so it doesn't increase the stats but still goes to the network stack. So I think the current behavior is not correct.
XDP_ABORTED looks like a better choice.
/P
Thanks, Quang Minh.
On Thu, 5 Jun 2025 21:33:26 +0700 Bui Quang Minh wrote:
On 6/5/25 18:03, Paolo Abeni wrote:
On 6/3/25 5:06 PM, Bui Quang Minh wrote:
In virtio-net, we do not yet support multi-buffer XDP packets in zerocopy mode when an XDP program is bound. However, in that case, when receiving a multi-buffer XDP packet, we skip the XDP program and return XDP_PASS. As a result, the packet is passed to the normal network stack, which is incorrect behavior.
Why? AFAICS the multi-buffer mode depends on features negotiation, which is not controlled by the VM user.
Let's suppose the user wants to attach an XDP program to do some per-packet stats accounting. That would suddenly cause packets to be dropped depending on conditions not controlled by the (guest) user. It looks wrong to me.
But currently, if a multi-buffer packet arrives, it will not go through the XDP program, so it doesn't increase the stats but still goes to the network stack. So I think the current behavior is not correct.
Sounds fair, but at a glance the normal XDP path seems to be trying to linearize the frame. Can we not try to flatten the frame here? If it's simply too long for the chunk size, that's a frame length error, right?
On 6/5/25 21:48, Jakub Kicinski wrote:
On Thu, 5 Jun 2025 21:33:26 +0700 Bui Quang Minh wrote:
On 6/5/25 18:03, Paolo Abeni wrote:
On 6/3/25 5:06 PM, Bui Quang Minh wrote:
In virtio-net, we do not yet support multi-buffer XDP packets in zerocopy mode when an XDP program is bound. However, in that case, when receiving a multi-buffer XDP packet, we skip the XDP program and return XDP_PASS. As a result, the packet is passed to the normal network stack, which is incorrect behavior.
Why? AFAICS the multi-buffer mode depends on features negotiation, which is not controlled by the VM user.
Let's suppose the user wants to attach an XDP program to do some per-packet stats accounting. That would suddenly cause packets to be dropped depending on conditions not controlled by the (guest) user. It looks wrong to me.
But currently, if a multi-buffer packet arrives, it will not go through the XDP program, so it doesn't increase the stats but still goes to the network stack. So I think the current behavior is not correct.
Sounds fair, but at a glance the normal XDP path seems to be trying to linearize the frame. Can we not try to flatten the frame here? If it's simply too long for the chunk size, that's a frame length error, right?
Here we are in the zerocopy path, so the buffers the frame is filled into are allocated from the XDP socket's umem. If the frame spans multiple buffers, the total frame size is larger than the chunk size. Furthermore, since we are in zerocopy mode, we cannot linearize by allocating a buffer large enough to cover the whole frame and then copying the frame data into it; that's not zerocopy anymore. Also, XDP socket zerocopy receive assumes that the packets it receives come from the umem pool. AFAIK, the generic XDP path is for copy mode only.
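For reference, the per-buffer bound being described comes from the xsk pool itself; a minimal sketch using the generic helper (field name approximate, not the exact driver code):

        /* One zerocopy rx buffer is one umem chunk (minus headroom), so the
         * largest frame that fits in a single buffer is the pool's rx frame
         * size; a frame that needs num_buf > 1 is bigger than a chunk and
         * cannot be made linear without copying it out of the umem.
         */
        u32 max_single_buf = xsk_pool_get_rx_frame_size(rq->xsk_pool);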
Thanks, Quang Minh.
On Fri, 6 Jun 2025 22:48:53 +0700 Bui Quang Minh wrote:
But currently, if a multi-buffer packet arrives, it will not go through the XDP program, so it doesn't increase the stats but still goes to the network stack. So I think the current behavior is not correct.
Sounds fair, but at a glance the normal XDP path seems to be trying to linearize the frame. Can we not try to flatten the frame here? If it's simply too long for the chunk size, that's a frame length error, right?
Here we are in the zerocopy path, so the buffers the frame is filled into are allocated from the XDP socket's umem. If the frame spans multiple buffers, the total frame size is larger than the chunk size.
Is that always the case? Can the multi-buf not be due to header-data split of the incoming frame? (I'm not familiar with the virtio spec)
Furthermore, since we are in zerocopy mode, we cannot linearize by allocating a buffer large enough to cover the whole frame and then copying the frame data into it; that's not zerocopy anymore. Also, XDP socket zerocopy receive assumes that the packets it receives come from the umem pool. AFAIK, the generic XDP path is for copy mode only.
Generic XDP == do_xdp_generic(); here I think you mean the normal XDP path in the virtio driver? If so then no, XDP is very much not expected to copy each frame before processing.
This is only slightly related to your patch, but while we're talking about multi-buf - in the netdev CI the test which sends a ping while an XDP multi-buf program is attached is really flaky :( https://netdev.bots.linux.dev/contest.html?executor=vmksft-drv-hw&test=p...
On 6/9/25 23:58, Jakub Kicinski wrote:
On Fri, 6 Jun 2025 22:48:53 +0700 Bui Quang Minh wrote:
But currently, if a multi-buffer packet arrives, it will not go through the XDP program, so it doesn't increase the stats but still goes to the network stack. So I think the current behavior is not correct.
Sounds fair, but at a glance the normal XDP path seems to be trying to linearize the frame. Can we not try to flatten the frame here? If it's simply too long for the chunk size, that's a frame length error, right?
Here we are in the zerocopy path, so the buffers the frame is filled into are allocated from the XDP socket's umem. If the frame spans multiple buffers, the total frame size is larger than the chunk size.
Is that always the case? Can the multi-buf not be due to header-data split of the incoming frame? (I'm not familiar with the virtio spec)
Ah, maybe I caused some confusion :) Zerocopy here means the frame is zerocopy if it is redirected to the XDP socket. In this zerocopy mode, the XDP socket provides buffers to virtio-net, and the frame from vhost is placed in those buffers. If the bound XDP program in virtio-net returns XDP_REDIRECT to that XDP socket, then the frame is zerocopy. In case XDP_PASS is returned, the frame's data is copied to a newly created skb and the frame's buffer is returned to the XDP socket. AFAIK, virtio-net does not support header-data split yet.
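To sketch the two outcomes described above (simplified pseudo-flow, not the exact driver code; build_skb_from_xdp() is a made-up placeholder):

        ret = bpf_prog_run_xdp(prog, xdp);      /* frame data sits in the umem */
        switch (ret) {
        case XDP_REDIRECT:
                /* zerocopy: the umem buffer is handed to the AF_XDP socket */
                xdp_do_redirect(dev, xdp, prog);
                break;
        case XDP_PASS:
                /* copy: build an skb from the umem data, then return the
                 * buffer to the pool
                 */
                skb = build_skb_from_xdp(xdp);  /* placeholder */
                xsk_buff_free(xdp);
                break;
        }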
Furthermore, since we are in zerocopy mode, we cannot linearize by allocating a buffer large enough to cover the whole frame and then copying the frame data into it; that's not zerocopy anymore. Also, XDP socket zerocopy receive assumes that the packets it receives come from the umem pool. AFAIK, the generic XDP path is for copy mode only.
Generic XDP == do_xdp_generic(); here I think you mean the normal XDP path in the virtio driver? If so then no, XDP is very much not expected to copy each frame before processing.
Yes, I mean generic XDP = do_xdp_generic(). I mean that we can linearize the frame if needed (like in netif_skb_check_for_xdp()) in copy mode for the XDP socket, but not in zerocopy mode.
This is only slightly related to your patch, but while we're talking about multi-buf - in the netdev CI the test which sends a ping while an XDP multi-buf program is attached is really flaky :( https://netdev.bots.linux.dev/contest.html?executor=vmksft-drv-hw&test=p...
metal-drv-hw means the NETIF is the real NIC, right?
Thanks, Quang Minh.
On Tue, 10 Jun 2025 22:18:32 +0700 Bui Quang Minh wrote:
Furthermore, since we are in zerocopy mode, we cannot linearize by allocating a buffer large enough to cover the whole frame and then copying the frame data into it; that's not zerocopy anymore. Also, XDP socket zerocopy receive assumes that the packets it receives come from the umem pool. AFAIK, the generic XDP path is for copy mode only.
Generic XDP == do_xdp_generic(); here I think you mean the normal XDP path in the virtio driver? If so then no, XDP is very much not expected to copy each frame before processing.
Yes, I mean generic XDP = do_xdp_generic(). I mean that we can linearize the frame if needed (like in netif_skb_check_for_xdp()) in copy mode for the XDP socket, but not in zerocopy mode.
Okay, I meant the copies in the driver - virtio calls xdp_linearize_page() in a few places, for normal XDP.
This is only slightly related to your patch, but while we're talking about multi-buf - in the netdev CI the test which sends a ping while an XDP multi-buf program is attached is really flaky :( https://netdev.bots.linux.dev/contest.html?executor=vmksft-drv-hw&test=p...
metal-drv-hw means the NETIF is the real NIC, right?
The "metal" in the name refers to the AWS instance type that hosts the runner. The test runs in a VM over virtio, more details: https://github.com/linux-netdev/nipa/wiki/Running-driver-tests-on-virtio
On 6/11/25 03:37, Jakub Kicinski wrote:
On Tue, 10 Jun 2025 22:18:32 +0700 Bui Quang Minh wrote:
Furthermore, since we are in zerocopy mode, we cannot linearize by allocating a buffer large enough to cover the whole frame and then copying the frame data into it; that's not zerocopy anymore. Also, XDP socket zerocopy receive assumes that the packets it receives come from the umem pool. AFAIK, the generic XDP path is for copy mode only.
Generic XDP == do_xdp_generic(); here I think you mean the normal XDP path in the virtio driver? If so then no, XDP is very much not expected to copy each frame before processing.
Yes, I mean generic XDP = do_xdp_generic(). I mean that we can linearize the frame if needed (like in netif_skb_check_for_xdp()) in copy mode for the XDP socket, but not in zerocopy mode.
Okay, I meant the copies in the driver - virtio calls xdp_linearize_page() in a few places, for normal XDP.
This is only slightly related to your patch, but while we're talking about multi-buf - in the netdev CI the test which sends a ping while an XDP multi-buf program is attached is really flaky :( https://netdev.bots.linux.dev/contest.html?executor=vmksft-drv-hw&test=p...
metal-drv-hw means the NETIF is the real NIC, right?
The "metal" in the name refers to the AWS instance type that hosts the runner. The test runs in a VM over virtio, more details: https://github.com/linux-netdev/nipa/wiki/Running-driver-tests-on-virtio
I've figured out the problem. When the test fails, in mergeable_xdp_get_buf:

        xdp_room = SKB_DATA_ALIGN(XDP_PACKET_HEADROOM +
                                  sizeof(struct skb_shared_info));
        if (*len + xdp_room > PAGE_SIZE)
                return NULL;
*len + xdp_room > PAGE_SIZE, so NULL is returned and the packet is dropped. This case happens when add_recvbuf_mergeable is called while no XDP program is loaded, so it does not reserve space for XDP_PACKET_HEADROOM and struct skb_shared_info. But by the time vhost uses that buffer and sends it back to virtio-net, an XDP program is loaded. The code assumes that an XDP frag cannot exceed PAGE_SIZE, which I think is not correct anymore. Due to that assumption, when frame data + XDP_PACKET_HEADROOM + sizeof(struct skb_shared_info) > PAGE_SIZE, the code does not build an xdp_buff but drops the frame. xdp_linearize_page makes the same assumption. As I don't think the assumption is correct anymore, the fix might be to allocate a buffer big enough to build the xdp_buff.
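To make the numbers concrete (typical x86-64 values, so treat them as approximate): XDP_PACKET_HEADROOM is 256 and sizeof(struct skb_shared_info) is about 320, so xdp_room = SKB_DATA_ALIGN(256 + 320) = 576. With PAGE_SIZE = 4096, any buffer with *len > 4096 - 576 = 3520 bytes fails this check and the frame is dropped, which can happen when the buffer was posted (without the XDP headroom reserved) before the program was attached.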
Thanks, Quang Minh.
On Tue, 3 Jun 2025 22:06:13 +0700, Bui Quang Minh minhquangbui99@gmail.com wrote:
In virtio-net, we do not yet support multi-buffer XDP packets in zerocopy mode when an XDP program is bound. However, in that case, when receiving a multi-buffer XDP packet, we skip the XDP program and return XDP_PASS. As a result, the packet is passed to the normal network stack, which is incorrect behavior. This commit instead returns XDP_DROP in that case.
Fixes: 99c861b44eb1 ("virtio_net: xsk: rx: support recv merge mode")
Cc: stable@vger.kernel.org
Signed-off-by: Bui Quang Minh <minhquangbui99@gmail.com>

Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>

 drivers/net/virtio_net.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index e53ba600605a..4c35324d6e5b 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1309,9 +1309,14 @@ static struct sk_buff *virtnet_receive_xsk_merge(struct net_device *dev, struct
         ret = XDP_PASS;
         rcu_read_lock();
         prog = rcu_dereference(rq->xdp_prog);
-        /* TODO: support multi buffer. */
-        if (prog && num_buf == 1)
-                ret = virtnet_xdp_handler(prog, xdp, dev, xdp_xmit, stats);
+        if (prog) {
+                /* TODO: support multi buffer. */
+                if (num_buf == 1)
+                        ret = virtnet_xdp_handler(prog, xdp, dev, xdp_xmit,
+                                                  stats);
+                else
+                        ret = XDP_DROP;
+        }
         rcu_read_unlock();

         switch (ret) {
-- 2.43.0