Hi all,
this series contains some improvements for the selftest patches. The other patches remain unchanged. Please check the changelist below.
I have reverted the addition of the NOARP flag from the previous version, as it was not effective and the CI was still failing occasionally because of the race condition caused by foreign packets interfering with the veth tests. This series contains an alternative solution by filtering all but the test packets using the attached XDP program.
Successful pipeline: https://github.com/kernel-patches/bpf/actions/runs/13552017584
---
v4: - strip unrelated changes from the selftest patches - extend commit message for "selftests/bpf: refactor xdp_context_functional test and bpf program" - the NOARP flag was not effective to prevent other packets from interfering with the tests, add a filter to the XDP program instead - run xdp_context_tuntap in a separate namespace to avoid conflicts with other tests
v3: https://lore.kernel.org/bpf/20250224152909.3911544-1-marcus.wichelmann@hetzn... - change the condition to handle xdp_buffs without metadata support, as suggested by Willem de Bruijn willemb@google.com - add clarifying comment why that condition is needed - set NOARP flag in selftests to ensure that the kernel does not send packets on the test interfaces that may interfere with the tests
v2: https://lore.kernel.org/bpf/20250217172308.3291739-1-marcus.wichelmann@hetzn... - submit against bpf-next subtree - split commits and improved commit messages - remove redundant metasize check and add clarifying comment instead - use max() instead of ternary operator - add selftest for metadata support in the tun driver
v1: https://lore.kernel.org/all/20250130171614.1657224-1-marcus.wichelmann@hetzn...
Marcus Wichelmann (6): net: tun: enable XDP metadata support net: tun: enable transfer of XDP metadata to skb selftests/bpf: move open_tuntap to network helpers selftests/bpf: refactor xdp_context_functional test and bpf program selftests/bpf: add test for XDP metadata support in tun driver selftests/bpf: fix file descriptor assertion in open_tuntap helper
drivers/net/tun.c | 28 +++- tools/testing/selftests/bpf/network_helpers.c | 28 ++++ tools/testing/selftests/bpf/network_helpers.h | 3 + .../selftests/bpf/prog_tests/lwt_helpers.h | 29 ---- .../bpf/prog_tests/xdp_context_test_run.c | 138 +++++++++++++++++- .../selftests/bpf/progs/test_xdp_meta.c | 53 +++++-- 6 files changed, 223 insertions(+), 56 deletions(-)
Enable the support for the bpf_xdp_adjust_meta helper function for XDP buffers initialized by the tun driver. This allows to reserve a metadata area that is useful to pass any information from one XDP program to another one, for example when using tail-calls.
Whether this helper function can be used in an XDP program depends on how the xdp_buff was initialized. Most net drivers initialize the xdp_buff in a way, that allows bpf_xdp_adjust_meta to be used. In case of the tun driver, this is currently not the case.
There are two code paths in the tun driver that lead to a bpf_prog_run_xdp and where metadata support should be enabled:
1. tun_build_skb, which is called by tun_get_user and is used when writing packets from userspace into the device. In this case, the xdp_buff created in tun_build_skb has no support for bpf_xdp_adjust_meta and calls of that helper function result in ENOTSUPP.
For this code path, it's sufficient to set the meta_valid argument of the xdp_prepare_buff call. The reserved headroom is large enough already.
2. tun_xdp_one, which is called by tun_sendmsg which again is called by other drivers (e.g. vhost_net). When the TUN_MSG_PTR mode is used, another driver may pass a batch of xdp_buffs to the tun driver. In this case, that other driver is the one initializing the xdp_buff.
See commit 043d222f93ab ("tuntap: accept an array of XDP buffs through sendmsg()") for details.
For now, the vhost_net driver is the only one using TUN_MSG_PTR and it already initializes the xdp_buffs with metadata support and sufficient headroom. But the tun driver disables it again, so the xdp_set_data_meta_invalid call has to be removed.
Signed-off-by: Marcus Wichelmann marcus.wichelmann@hetzner-cloud.de Acked-by: Jason Wang jasowang@redhat.com --- drivers/net/tun.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c index acf96f262488..4ec8fbd93c8d 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -1709,7 +1709,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun, u32 act;
xdp_init_buff(&xdp, buflen, &tfile->xdp_rxq); - xdp_prepare_buff(&xdp, buf, pad, len, false); + xdp_prepare_buff(&xdp, buf, pad, len, true);
act = bpf_prog_run_xdp(xdp_prog, &xdp); if (act == XDP_REDIRECT || act == XDP_TX) { @@ -2467,7 +2467,6 @@ static int tun_xdp_one(struct tun_struct *tun, }
xdp_init_buff(xdp, buflen, &tfile->xdp_rxq); - xdp_set_data_meta_invalid(xdp);
act = bpf_prog_run_xdp(xdp_prog, xdp); ret = tun_xdp_act(tun, xdp_prog, xdp, act);
When the XDP metadata area was used, it is expected that the same metadata can also be accessed from TC, as can be read in the description of the bpf_xdp_adjust_meta helper function. In the tun driver, this was not yet implemented.
To make this work, the skb that is being built on XDP_PASS should know of the current size of the metadata area. This is ensured by adding calls to skb_metadata_set. For the tun_xdp_one code path, an additional check is necessary to handle the case where the externally initialized xdp_buff has no metadata support (xdp->data_meta == xdp->data + 1).
More information about this feature can be found in the commit message of commit de8f3a83b0a0 ("bpf: add meta pointer for direct access").
Signed-off-by: Marcus Wichelmann marcus.wichelmann@hetzner-cloud.de Reviewed-by: Willem de Bruijn willemb@google.com Acked-by: Jason Wang jasowang@redhat.com --- drivers/net/tun.c | 25 ++++++++++++++++++++++--- 1 file changed, 22 insertions(+), 3 deletions(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c index 4ec8fbd93c8d..70208b3a2e93 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -1600,7 +1600,8 @@ static bool tun_can_build_skb(struct tun_struct *tun, struct tun_file *tfile,
static struct sk_buff *__tun_build_skb(struct tun_file *tfile, struct page_frag *alloc_frag, char *buf, - int buflen, int len, int pad) + int buflen, int len, int pad, + int metasize) { struct sk_buff *skb = build_skb(buf, buflen);
@@ -1609,6 +1610,8 @@ static struct sk_buff *__tun_build_skb(struct tun_file *tfile,
skb_reserve(skb, pad); skb_put(skb, len); + if (metasize) + skb_metadata_set(skb, metasize); skb_set_owner_w(skb, tfile->socket.sk);
get_page(alloc_frag->page); @@ -1668,6 +1671,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun, char *buf; size_t copied; int pad = TUN_RX_PAD; + int metasize = 0; int err = 0;
rcu_read_lock(); @@ -1695,7 +1699,7 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun, if (hdr->gso_type || !xdp_prog) { *skb_xdp = 1; return __tun_build_skb(tfile, alloc_frag, buf, buflen, len, - pad); + pad, metasize); }
*skb_xdp = 0; @@ -1730,12 +1734,18 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
pad = xdp.data - xdp.data_hard_start; len = xdp.data_end - xdp.data; + + /* It is known that the xdp_buff was prepared with metadata + * support, so the metasize will never be negative. + */ + metasize = xdp.data - xdp.data_meta; } bpf_net_ctx_clear(bpf_net_ctx); rcu_read_unlock(); local_bh_enable();
- return __tun_build_skb(tfile, alloc_frag, buf, buflen, len, pad); + return __tun_build_skb(tfile, alloc_frag, buf, buflen, len, pad, + metasize);
out: bpf_net_ctx_clear(bpf_net_ctx); @@ -2452,6 +2462,7 @@ static int tun_xdp_one(struct tun_struct *tun, struct sk_buff_head *queue; u32 rxhash = 0, act; int buflen = hdr->buflen; + int metasize = 0; int ret = 0; bool skb_xdp = false; struct page *page; @@ -2506,6 +2517,14 @@ static int tun_xdp_one(struct tun_struct *tun, skb_reserve(skb, xdp->data - xdp->data_hard_start); skb_put(skb, xdp->data_end - xdp->data);
+ /* The externally provided xdp_buff may have no metadata support, which + * is marked by xdp->data_meta being xdp->data + 1. This will lead to a + * metasize of -1 and is the reason why the condition checks for > 0. + */ + metasize = xdp->data - xdp->data_meta; + if (metasize > 0) + skb_metadata_set(skb, metasize); + if (virtio_net_hdr_to_skb(skb, gso, tun_is_little_endian(tun))) { atomic_long_inc(&tun->rx_frame_errors); kfree_skb(skb);
On 2/27/25 6:23 AM, Marcus Wichelmann wrote:
When the XDP metadata area was used, it is expected that the same metadata can also be accessed from TC, as can be read in the description of the bpf_xdp_adjust_meta helper function. In the tun driver, this was not yet implemented.
To make this work, the skb that is being built on XDP_PASS should know of the current size of the metadata area. This is ensured by adding calls to skb_metadata_set. For the tun_xdp_one code path, an additional check is necessary to handle the case where the externally initialized xdp_buff has no metadata support (xdp->data_meta == xdp->data + 1).
More information about this feature can be found in the commit message of commit de8f3a83b0a0 ("bpf: add meta pointer for direct access").
Signed-off-by: Marcus Wichelmann marcus.wichelmann@hetzner-cloud.de
Reviewed-by: Willem de Bruijn willemb@google.com Acked-by: Jason Wang jasowang@redhat.com
drivers/net/tun.c | 25 ++++++++++++++++++++++--- 1 file changed, 22 insertions(+), 3 deletions(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c index 4ec8fbd93c8d..70208b3a2e93 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c
The changes have conflicts with the commit 2506251e81d1 ("tun: Decouple vnet handling").
It is better to rebase the works onto the bpf-next/net, i.e. the "net" branch instead of the "master" branch.
To test the XDP metadata functionality of the tun driver, it's necessary to create a new tap device first. A helper function for this already exists in lwt_helpers.h. Move it to the common network helpers header, so it can be reused in other tests.
Signed-off-by: Marcus Wichelmann marcus.wichelmann@hetzner-cloud.de Reviewed-by: Willem de Bruijn willemb@google.com Acked-by: Jason Wang jasowang@redhat.com --- tools/testing/selftests/bpf/network_helpers.c | 28 ++++++++++++++++++ tools/testing/selftests/bpf/network_helpers.h | 3 ++ .../selftests/bpf/prog_tests/lwt_helpers.h | 29 ------------------- 3 files changed, 31 insertions(+), 29 deletions(-)
diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c index 737a952dcf80..e1cfa1b37754 100644 --- a/tools/testing/selftests/bpf/network_helpers.c +++ b/tools/testing/selftests/bpf/network_helpers.c @@ -565,6 +565,34 @@ void close_netns(struct nstoken *token) free(token); }
+int open_tuntap(const char *dev_name, bool need_mac) +{ + int err = 0; + struct ifreq ifr; + int fd = open("/dev/net/tun", O_RDWR); + + if (!ASSERT_GT(fd, 0, "open(/dev/net/tun)")) + return -1; + + ifr.ifr_flags = IFF_NO_PI | (need_mac ? IFF_TAP : IFF_TUN); + strncpy(ifr.ifr_name, dev_name, IFNAMSIZ - 1); + ifr.ifr_name[IFNAMSIZ - 1] = '\0'; + + err = ioctl(fd, TUNSETIFF, &ifr); + if (!ASSERT_OK(err, "ioctl(TUNSETIFF)")) { + close(fd); + return -1; + } + + err = fcntl(fd, F_SETFL, O_NONBLOCK); + if (!ASSERT_OK(err, "fcntl(O_NONBLOCK)")) { + close(fd); + return -1; + } + + return fd; +} + int get_socket_local_port(int sock_fd) { struct sockaddr_storage addr; diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h index 9f6e05d886c5..99d1417c1d8b 100644 --- a/tools/testing/selftests/bpf/network_helpers.h +++ b/tools/testing/selftests/bpf/network_helpers.h @@ -8,6 +8,7 @@ typedef __u16 __sum16; #include <linux/if_ether.h> #include <linux/if_packet.h> +#include <linux/if_tun.h> #include <linux/ip.h> #include <linux/ipv6.h> #include <linux/ethtool.h> @@ -98,6 +99,8 @@ int send_recv_data(int lfd, int fd, uint32_t total_bytes); int make_netns(const char *name); int remove_netns(const char *name);
+int open_tuntap(const char *dev_name, bool need_mac); + /** * append_tid() - Append thread ID to the given string. * diff --git a/tools/testing/selftests/bpf/prog_tests/lwt_helpers.h b/tools/testing/selftests/bpf/prog_tests/lwt_helpers.h index fb1eb8c67361..ccec0fcdabc1 100644 --- a/tools/testing/selftests/bpf/prog_tests/lwt_helpers.h +++ b/tools/testing/selftests/bpf/prog_tests/lwt_helpers.h @@ -5,7 +5,6 @@
#include <time.h> #include <net/if.h> -#include <linux/if_tun.h> #include <linux/icmp.h>
#include "test_progs.h" @@ -37,34 +36,6 @@ static inline int netns_delete(void) return system("ip netns del " NETNS ">/dev/null 2>&1"); }
-static int open_tuntap(const char *dev_name, bool need_mac) -{ - int err = 0; - struct ifreq ifr; - int fd = open("/dev/net/tun", O_RDWR); - - if (!ASSERT_GT(fd, 0, "open(/dev/net/tun)")) - return -1; - - ifr.ifr_flags = IFF_NO_PI | (need_mac ? IFF_TAP : IFF_TUN); - strncpy(ifr.ifr_name, dev_name, IFNAMSIZ - 1); - ifr.ifr_name[IFNAMSIZ - 1] = '\0'; - - err = ioctl(fd, TUNSETIFF, &ifr); - if (!ASSERT_OK(err, "ioctl(TUNSETIFF)")) { - close(fd); - return -1; - } - - err = fcntl(fd, F_SETFL, O_NONBLOCK); - if (!ASSERT_OK(err, "fcntl(O_NONBLOCK)")) { - close(fd); - return -1; - } - - return fd; -} - #define ICMP_PAYLOAD_SIZE 100
/* Match an ICMP packet with payload len ICMP_PAYLOAD_SIZE */
The existing XDP metadata test works by creating a veth pair and attaching XDP & TC programs that drop the packet when the condition of the test isn't fulfilled. The test then pings through the veth pair and succeeds when the ping comes through.
While this test works great for a veth pair, it is hard to replicate for tap devices to test the XDP metadata support of them. A similar test for the tun driver would either involve logic to reply to the ping request, or would have to capture the packet to check if it was dropped or not.
To make the testing of other drivers easier while still maximizing code reuse, this commit refactors the existing xdp_context_functional test to use a test_result map. Instead of conditionally passing or dropping the packet, the TC program is changed to copy the received metadata into the value of that single-entry array map. Tests can then verify that the map value matches the expectation.
This testing logic is easy to adapt to other network drivers as the only remaining requirement is that there is some way to send a custom Ethernet packet through it that triggers the XDP & TC programs.
The Ethernet header of that custom packet is all-zero, because it is not required to be valid for the test to work. The zero ethertype also helps to filter out packets that are not related to the test and would otherwise interfere with it.
The payload of the Ethernet packet is used as the test data that is expected to be passed as metadata from the XDP to the TC program and written to the map. It has a fixed size of 32 bytes which is a reasonable size that should be supported by both drivers. Additional packet headers are not necessary for the test and were therefore skipped to keep the testing code short.
This new testing methodology no longer requires the veth interfaces to have IP addresses assigned, therefore these were removed.
Signed-off-by: Marcus Wichelmann marcus.wichelmann@hetzner-cloud.de --- .../bpf/prog_tests/xdp_context_test_run.c | 79 +++++++++++++++++-- .../selftests/bpf/progs/test_xdp_meta.c | 53 +++++++++---- 2 files changed, 110 insertions(+), 22 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c index 937da9b7532a..78ca01edb050 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c @@ -4,13 +4,19 @@ #include "test_xdp_context_test_run.skel.h" #include "test_xdp_meta.skel.h"
-#define TX_ADDR "10.0.0.1" -#define RX_ADDR "10.0.0.2" #define RX_NAME "veth0" #define TX_NAME "veth1" #define TX_NETNS "xdp_context_tx" #define RX_NETNS "xdp_context_rx"
+#define TEST_PAYLOAD_LEN 32 +static const __u8 test_payload[TEST_PAYLOAD_LEN] = { + 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, + 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, + 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27, 0x28, + 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, +}; + void test_xdp_context_error(int prog_fd, struct bpf_test_run_opts opts, __u32 data_meta, __u32 data, __u32 data_end, __u32 ingress_ifindex, __u32 rx_queue_index, @@ -112,7 +118,59 @@ void test_xdp_context_test_run(void) test_xdp_context_test_run__destroy(skel); }
-void test_xdp_context_functional(void) +static int send_test_packet(int ifindex) +{ + int n, sock = -1; + __u8 packet[sizeof(struct ethhdr) + TEST_PAYLOAD_LEN]; + + /* The ethernet header is not relevant for this test and doesn't need to + * be meaningful. + */ + struct ethhdr eth = { 0 }; + + memcpy(packet, ð, sizeof(eth)); + memcpy(packet + sizeof(eth), test_payload, TEST_PAYLOAD_LEN); + + sock = socket(AF_PACKET, SOCK_RAW, IPPROTO_RAW); + if (!ASSERT_GE(sock, 0, "socket")) + goto err; + + struct sockaddr_ll saddr = { + .sll_family = PF_PACKET, + .sll_ifindex = ifindex, + .sll_halen = ETH_ALEN + }; + n = sendto(sock, packet, sizeof(packet), 0, (struct sockaddr *)&saddr, + sizeof(saddr)); + if (!ASSERT_EQ(n, sizeof(packet), "sendto")) + goto err; + + close(sock); + return 0; + +err: + if (sock >= 0) + close(sock); + return -1; +} + +static void assert_test_result(struct test_xdp_meta *skel) +{ + int err; + __u32 map_key = 0; + __u8 map_value[TEST_PAYLOAD_LEN]; + + err = bpf_map__lookup_elem(skel->maps.test_result, &map_key, + sizeof(map_key), &map_value, + TEST_PAYLOAD_LEN, BPF_ANY); + if (!ASSERT_OK(err, "lookup test_result")) + return; + + ASSERT_MEMEQ(&map_value, &test_payload, TEST_PAYLOAD_LEN, + "test_result map contains test payload"); +} + +void test_xdp_context_veth(void) { LIBBPF_OPTS(bpf_tc_hook, tc_hook, .attach_point = BPF_TC_INGRESS); LIBBPF_OPTS(bpf_tc_opts, tc_opts, .handle = 1, .priority = 1); @@ -120,7 +178,7 @@ void test_xdp_context_functional(void) struct bpf_program *tc_prog, *xdp_prog; struct test_xdp_meta *skel = NULL; struct nstoken *nstoken = NULL; - int rx_ifindex; + int rx_ifindex, tx_ifindex; int ret;
tx_ns = netns_new(TX_NETNS, false); @@ -138,7 +196,6 @@ void test_xdp_context_functional(void) if (!ASSERT_OK_PTR(nstoken, "setns rx_ns")) goto close;
- SYS(close, "ip addr add " RX_ADDR "/24 dev " RX_NAME); SYS(close, "ip link set dev " RX_NAME " up");
skel = test_xdp_meta__open_and_load(); @@ -179,9 +236,17 @@ void test_xdp_context_functional(void) if (!ASSERT_OK_PTR(nstoken, "setns tx_ns")) goto close;
- SYS(close, "ip addr add " TX_ADDR "/24 dev " TX_NAME); SYS(close, "ip link set dev " TX_NAME " up"); - ASSERT_OK(SYS_NOFAIL("ping -c 1 " RX_ADDR), "ping"); + + tx_ifindex = if_nametoindex(TX_NAME); + if (!ASSERT_GE(tx_ifindex, 0, "if_nametoindex tx")) + goto close; + + ret = send_test_packet(tx_ifindex); + if (!ASSERT_OK(ret, "send_test_packet")) + goto close; + + assert_test_result(skel);
close: close_netns(nstoken); diff --git a/tools/testing/selftests/bpf/progs/test_xdp_meta.c b/tools/testing/selftests/bpf/progs/test_xdp_meta.c index fe2d71ae0e71..fcf6ca14f2ea 100644 --- a/tools/testing/selftests/bpf/progs/test_xdp_meta.c +++ b/tools/testing/selftests/bpf/progs/test_xdp_meta.c @@ -4,37 +4,50 @@
#include <bpf/bpf_helpers.h>
-#define __round_mask(x, y) ((__typeof__(x))((y) - 1)) -#define round_up(x, y) ((((x) - 1) | __round_mask(x, y)) + 1) +#define META_SIZE 32 + #define ctx_ptr(ctx, mem) (void *)(unsigned long)ctx->mem
+/* Demonstrates how metadata can be passed from an XDP program to a TC program + * using bpf_xdp_adjust_meta. + * For the sake of testing the metadata support in drivers, the XDP program uses + * a fixed-size payload after the Ethernet header as metadata. The TC program + * copies the metadata it receives into a map so it can be checked from + * userspace. + */ + +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __uint(max_entries, 1); + __type(key, __u32); + __uint(value_size, META_SIZE); +} test_result SEC(".maps"); + SEC("tc") int ing_cls(struct __sk_buff *ctx) { - __u8 *data, *data_meta, *data_end; - __u32 diff = 0; + __u8 *data, *data_meta; + __u32 key = 0;
data_meta = ctx_ptr(ctx, data_meta); - data_end = ctx_ptr(ctx, data_end); data = ctx_ptr(ctx, data);
- if (data + ETH_ALEN > data_end || - data_meta + round_up(ETH_ALEN, 4) > data) + if (data_meta + META_SIZE > data) return TC_ACT_SHOT;
- diff |= ((__u32 *)data_meta)[0] ^ ((__u32 *)data)[0]; - diff |= ((__u16 *)data_meta)[2] ^ ((__u16 *)data)[2]; + bpf_map_update_elem(&test_result, &key, data_meta, BPF_ANY);
- return diff ? TC_ACT_SHOT : TC_ACT_OK; + return TC_ACT_SHOT; }
SEC("xdp") int ing_xdp(struct xdp_md *ctx) { - __u8 *data, *data_meta, *data_end; + __u8 *data, *data_meta, *data_end, *payload; + struct ethhdr *eth; int ret;
- ret = bpf_xdp_adjust_meta(ctx, -round_up(ETH_ALEN, 4)); + ret = bpf_xdp_adjust_meta(ctx, -META_SIZE); if (ret < 0) return XDP_DROP;
@@ -42,11 +55,21 @@ int ing_xdp(struct xdp_md *ctx) data_end = ctx_ptr(ctx, data_end); data = ctx_ptr(ctx, data);
- if (data + ETH_ALEN > data_end || - data_meta + round_up(ETH_ALEN, 4) > data) + eth = (struct ethhdr *)data; + payload = data + sizeof(struct ethhdr); + + if (payload + META_SIZE > data_end || + data_meta + META_SIZE > data) + return XDP_DROP; + + /* The Linux networking stack may send other packets on the test + * interface that interfere with the test. Just drop them. + * The test packets can be recognized by their ethertype of zero. + */ + if (eth->h_proto != 0) return XDP_DROP;
- __builtin_memcpy(data_meta, data, ETH_ALEN); + __builtin_memcpy(data_meta, payload, META_SIZE); return XDP_PASS; }
Marcus Wichelmann wrote:
The existing XDP metadata test works by creating a veth pair and attaching XDP & TC programs that drop the packet when the condition of the test isn't fulfilled. The test then pings through the veth pair and succeeds when the ping comes through.
While this test works great for a veth pair, it is hard to replicate for tap devices to test the XDP metadata support of them. A similar test for the tun driver would either involve logic to reply to the ping request, or would have to capture the packet to check if it was dropped or not.
To make the testing of other drivers easier while still maximizing code reuse, this commit refactors the existing xdp_context_functional test to use a test_result map. Instead of conditionally passing or dropping the packet, the TC program is changed to copy the received metadata into the value of that single-entry array map. Tests can then verify that the map value matches the expectation.
This testing logic is easy to adapt to other network drivers as the only remaining requirement is that there is some way to send a custom Ethernet packet through it that triggers the XDP & TC programs.
The Ethernet header of that custom packet is all-zero, because it is not required to be valid for the test to work. The zero ethertype also helps to filter out packets that are not related to the test and would otherwise interfere with it.
The payload of the Ethernet packet is used as the test data that is expected to be passed as metadata from the XDP to the TC program and written to the map. It has a fixed size of 32 bytes which is a reasonable size that should be supported by both drivers. Additional packet headers are not necessary for the test and were therefore skipped to keep the testing code short.
This new testing methodology no longer requires the veth interfaces to have IP addresses assigned, therefore these were removed.
Signed-off-by: Marcus Wichelmann marcus.wichelmann@hetzner-cloud.de
Reviewed-by: Willem de Bruijn willemb@google.com
Add a selftest that creates a tap device, attaches XDP and TC programs, writes a packet with a test payload into the tap device and checks the test result. This test ensures that the XDP metadata support in the tun driver is enabled and that the metadata size is correctly passed to the skb.
See the previous commit ("selftests/bpf: refactor xdp_context_functional test and bpf program") for details about the test design.
The test runs in its own network namespace using the feature introduced with commit c047e0e0e435 ("selftests/bpf: Optionally open a dedicated namespace to run test in it"). This provides some extra safety against conflicting interface names.
Signed-off-by: Marcus Wichelmann marcus.wichelmann@hetzner-cloud.de --- .../bpf/prog_tests/xdp_context_test_run.c | 59 +++++++++++++++++++ 1 file changed, 59 insertions(+)
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c index 78ca01edb050..ce3e82cc69d1 100644 --- a/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_context_test_run.c @@ -8,6 +8,7 @@ #define TX_NAME "veth1" #define TX_NETNS "xdp_context_tx" #define RX_NETNS "xdp_context_rx" +#define TAP_NAME "tap0"
#define TEST_PAYLOAD_LEN 32 static const __u8 test_payload[TEST_PAYLOAD_LEN] = { @@ -255,3 +256,61 @@ void test_xdp_context_veth(void) netns_free(tx_ns); }
+void test_ns_xdp_context_tuntap(void) +{ + LIBBPF_OPTS(bpf_tc_hook, tc_hook, .attach_point = BPF_TC_INGRESS); + LIBBPF_OPTS(bpf_tc_opts, tc_opts, .handle = 1, .priority = 1); + struct test_xdp_meta *skel = NULL; + __u8 packet[sizeof(struct ethhdr) + TEST_PAYLOAD_LEN]; + int tap_fd = -1; + int tap_ifindex; + int ret; + + tap_fd = open_tuntap(TAP_NAME, true); + if (!ASSERT_GE(tap_fd, 0, "open_tuntap")) + goto close; + + SYS(close, "ip link set dev " TAP_NAME " up"); + + skel = test_xdp_meta__open_and_load(); + if (!ASSERT_OK_PTR(skel, "open and load skeleton")) + goto close; + + tap_ifindex = if_nametoindex(TAP_NAME); + if (!ASSERT_GE(tap_ifindex, 0, "if_nametoindex")) + goto close; + + tc_hook.ifindex = tap_ifindex; + ret = bpf_tc_hook_create(&tc_hook); + if (!ASSERT_OK(ret, "bpf_tc_hook_create")) + goto close; + + tc_opts.prog_fd = bpf_program__fd(skel->progs.ing_cls); + ret = bpf_tc_attach(&tc_hook, &tc_opts); + if (!ASSERT_OK(ret, "bpf_tc_attach")) + goto close; + + ret = bpf_xdp_attach(tap_ifindex, bpf_program__fd(skel->progs.ing_xdp), + 0, NULL); + if (!ASSERT_GE(ret, 0, "bpf_xdp_attach")) + goto close; + + /* The ethernet header is not relevant for this test and doesn't need to + * be meaningful. + */ + struct ethhdr eth = { 0 }; + + memcpy(packet, ð, sizeof(eth)); + memcpy(packet + sizeof(eth), test_payload, TEST_PAYLOAD_LEN); + + ret = write(tap_fd, packet, sizeof(packet)); + if (!ASSERT_EQ(ret, sizeof(packet), "write packet")) + goto close; + + assert_test_result(skel); + +close: + if (tap_fd >= 0) + close(tap_fd); + test_xdp_meta__destroy(skel); +}
On 02/27, Marcus Wichelmann wrote:
Add a selftest that creates a tap device, attaches XDP and TC programs, writes a packet with a test payload into the tap device and checks the test result. This test ensures that the XDP metadata support in the tun driver is enabled and that the metadata size is correctly passed to the skb.
See the previous commit ("selftests/bpf: refactor xdp_context_functional test and bpf program") for details about the test design.
The test runs in its own network namespace using the feature introduced with commit c047e0e0e435 ("selftests/bpf: Optionally open a dedicated namespace to run test in it"). This provides some extra safety against conflicting interface names.
Signed-off-by: Marcus Wichelmann marcus.wichelmann@hetzner-cloud.de
Acked-by: Stanislav Fomichev sdf@fomichev.me
The open_tuntap helper function uses open() to get a file descriptor for /dev/net/tun.
The open(2) manpage writes this about its return value:
On success, open(), openat(), and creat() return the new file descriptor (a nonnegative integer). On error, -1 is returned and errno is set to indicate the error.
This means that the fd > 0 assertion in the open_tuntap helper is incorrect and should rather check for fd >= 0.
When running the BPF selftests locally, this incorrect assertion was not an issue, but the BPF kernel-patches CI failed because of this:
open_tuntap:FAIL:open(/dev/net/tun) unexpected open(/dev/net/tun): actual 0 <= expected 0
Signed-off-by: Marcus Wichelmann marcus.wichelmann@hetzner-cloud.de Reviewed-by: Willem de Bruijn willemb@google.com --- tools/testing/selftests/bpf/network_helpers.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c index e1cfa1b37754..9b59bfd5d912 100644 --- a/tools/testing/selftests/bpf/network_helpers.c +++ b/tools/testing/selftests/bpf/network_helpers.c @@ -571,7 +571,7 @@ int open_tuntap(const char *dev_name, bool need_mac) struct ifreq ifr; int fd = open("/dev/net/tun", O_RDWR);
- if (!ASSERT_GT(fd, 0, "open(/dev/net/tun)")) + if (!ASSERT_GE(fd, 0, "open(/dev/net/tun)")) return -1;
ifr.ifr_flags = IFF_NO_PI | (need_mac ? IFF_TAP : IFF_TUN);
Hi Marcus
Since your patches are about the virtual network, I'd like to test it, but it conflicts (Please review the attachment to review more details) when I apply it to the master branch. My test based on this commit: commit 1e15510b71c99c6e49134d756df91069f7d18141 (origin/master, origin/HEAD) Merge: f09d694cf799 54e1b4becf5e Author: Linus Torvalds torvalds@linux-foundation.org Date: Thu Feb 27 09:32:42 2025 -0800
Merge tag 'net-6.14-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth.
Thanks Lei
On Thu, Feb 27, 2025 at 10:28 PM Marcus Wichelmann marcus.wichelmann@hetzner-cloud.de wrote:
Hi all,
this series contains some improvements for the selftest patches. The other patches remain unchanged. Please check the changelist below.
I have reverted the addition of the NOARP flag from the previous version, as it was not effective and the CI was still failing occasionally because of the race condition caused by foreign packets interfering with the veth tests. This series contains an alternative solution by filtering all but the test packets using the attached XDP program.
Successful pipeline: https://github.com/kernel-patches/bpf/actions/runs/13552017584
v4:
- strip unrelated changes from the selftest patches
- extend commit message for "selftests/bpf: refactor xdp_context_functional test and bpf program"
- the NOARP flag was not effective to prevent other packets from interfering with the tests, add a filter to the XDP program instead
- run xdp_context_tuntap in a separate namespace to avoid conflicts with other tests
v3: https://lore.kernel.org/bpf/20250224152909.3911544-1-marcus.wichelmann@hetzn...
- change the condition to handle xdp_buffs without metadata support, as suggested by Willem de Bruijn willemb@google.com
- add clarifying comment why that condition is needed
- set NOARP flag in selftests to ensure that the kernel does not send packets on the test interfaces that may interfere with the tests
v2: https://lore.kernel.org/bpf/20250217172308.3291739-1-marcus.wichelmann@hetzn...
- submit against bpf-next subtree
- split commits and improved commit messages
- remove redundant metasize check and add clarifying comment instead
- use max() instead of ternary operator
- add selftest for metadata support in the tun driver
v1: https://lore.kernel.org/all/20250130171614.1657224-1-marcus.wichelmann@hetzn...
Marcus Wichelmann (6): net: tun: enable XDP metadata support net: tun: enable transfer of XDP metadata to skb selftests/bpf: move open_tuntap to network helpers selftests/bpf: refactor xdp_context_functional test and bpf program selftests/bpf: add test for XDP metadata support in tun driver selftests/bpf: fix file descriptor assertion in open_tuntap helper
drivers/net/tun.c | 28 +++- tools/testing/selftests/bpf/network_helpers.c | 28 ++++ tools/testing/selftests/bpf/network_helpers.h | 3 + .../selftests/bpf/prog_tests/lwt_helpers.h | 29 ---- .../bpf/prog_tests/xdp_context_test_run.c | 138 +++++++++++++++++- .../selftests/bpf/progs/test_xdp_meta.c | 53 +++++-- 6 files changed, 223 insertions(+), 56 deletions(-)
-- 2.43.0
Am 28.02.25 um 06:43 schrieb Lei Yang:
Hi Marcus
Since your patches are about the virtual network, I'd like to test it, but it conflicts (Please review the attachment to review more details) when I apply it to the master branch. My test based on this commit: commit 1e15510b71c99c6e49134d756df91069f7d18141 (origin/master, origin/HEAD) Merge: f09d694cf799 54e1b4becf5e Author: Linus Torvalds torvalds@linux-foundation.org Date: Thu Feb 27 09:32:42 2025 -0800
Merge tag 'net-6.14-rc5' of
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth.
Hi,
thank you for including it in your tests.
The mentioned commit is not yet in bpf-next, but I rebased my patch on latest mainline and attached the updated patch for the conflicting commit.
The conflict was just about a section that was removed right below my added line in network_helpers.h, so no functional change.
---
From c7182e5a4d21696b9e8cd25f92e64e28129e2c6e Mon Sep 17 00:00:00 2001 From: Marcus Wichelmann marcus.wichelmann@hetzner-cloud.de Date: Thu, 13 Feb 2025 16:28:19 +0000 Subject: [PATCH] selftests/bpf: move open_tuntap to network helpers
To test the XDP metadata functionality of the tun driver, it's necessary to create a new tap device first. A helper function for this already exists in lwt_helpers.h. Move it to the common network helpers header, so it can be reused in other tests.
Signed-off-by: Marcus Wichelmann marcus.wichelmann@hetzner-cloud.de Reviewed-by: Willem de Bruijn willemb@google.com Acked-by: Jason Wang jasowang@redhat.com --- tools/testing/selftests/bpf/network_helpers.c | 28 ++++++++++++++++++ tools/testing/selftests/bpf/network_helpers.h | 3 ++ .../selftests/bpf/prog_tests/lwt_helpers.h | 29 ------------------- 3 files changed, 31 insertions(+), 29 deletions(-)
diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c index 80844a5fb1fe..fcee2c4a637a 100644 --- a/tools/testing/selftests/bpf/network_helpers.c +++ b/tools/testing/selftests/bpf/network_helpers.c @@ -548,6 +548,34 @@ void close_netns(struct nstoken *token) free(token); }
+int open_tuntap(const char *dev_name, bool need_mac) +{ + int err = 0; + struct ifreq ifr; + int fd = open("/dev/net/tun", O_RDWR); + + if (!ASSERT_GT(fd, 0, "open(/dev/net/tun)")) + return -1; + + ifr.ifr_flags = IFF_NO_PI | (need_mac ? IFF_TAP : IFF_TUN); + strncpy(ifr.ifr_name, dev_name, IFNAMSIZ - 1); + ifr.ifr_name[IFNAMSIZ - 1] = '\0'; + + err = ioctl(fd, TUNSETIFF, &ifr); + if (!ASSERT_OK(err, "ioctl(TUNSETIFF)")) { + close(fd); + return -1; + } + + err = fcntl(fd, F_SETFL, O_NONBLOCK); + if (!ASSERT_OK(err, "fcntl(O_NONBLOCK)")) { + close(fd); + return -1; + } + + return fd; +} + int get_socket_local_port(int sock_fd) { struct sockaddr_storage addr; diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h index ebec8a8d6f81..175dd547849f 100644 --- a/tools/testing/selftests/bpf/network_helpers.h +++ b/tools/testing/selftests/bpf/network_helpers.h @@ -8,6 +8,7 @@ typedef __u16 __sum16; #include <linux/if_ether.h> #include <linux/if_packet.h> +#include <linux/if_tun.h> #include <linux/ip.h> #include <linux/ipv6.h> #include <linux/ethtool.h> @@ -98,6 +99,8 @@ int send_recv_data(int lfd, int fd, uint32_t total_bytes); int make_netns(const char *name); int remove_netns(const char *name);
+int open_tuntap(const char *dev_name, bool need_mac); + static __u16 csum_fold(__u32 csum) { csum = (csum & 0xffff) + (csum >> 16); diff --git a/tools/testing/selftests/bpf/prog_tests/lwt_helpers.h b/tools/testing/selftests/bpf/prog_tests/lwt_helpers.h index fb1eb8c67361..ccec0fcdabc1 100644 --- a/tools/testing/selftests/bpf/prog_tests/lwt_helpers.h +++ b/tools/testing/selftests/bpf/prog_tests/lwt_helpers.h @@ -5,7 +5,6 @@
#include <time.h> #include <net/if.h> -#include <linux/if_tun.h> #include <linux/icmp.h>
#include "test_progs.h" @@ -37,34 +36,6 @@ static inline int netns_delete(void) return system("ip netns del " NETNS ">/dev/null 2>&1"); }
-static int open_tuntap(const char *dev_name, bool need_mac) -{ - int err = 0; - struct ifreq ifr; - int fd = open("/dev/net/tun", O_RDWR); - - if (!ASSERT_GT(fd, 0, "open(/dev/net/tun)")) - return -1; - - ifr.ifr_flags = IFF_NO_PI | (need_mac ? IFF_TAP : IFF_TUN); - strncpy(ifr.ifr_name, dev_name, IFNAMSIZ - 1); - ifr.ifr_name[IFNAMSIZ - 1] = '\0'; - - err = ioctl(fd, TUNSETIFF, &ifr); - if (!ASSERT_OK(err, "ioctl(TUNSETIFF)")) { - close(fd); - return -1; - } - - err = fcntl(fd, F_SETFL, O_NONBLOCK); - if (!ASSERT_OK(err, "fcntl(O_NONBLOCK)")) { - close(fd); - return -1; - } - - return fd; -} - #define ICMP_PAYLOAD_SIZE 100
/* Match an ICMP packet with payload len ICMP_PAYLOAD_SIZE */
Am 28.02.25 um 06:43 schrieb Lei Yang:
Hi Marcus
Since your patches are about the virtual network, I'd like to test it, but it conflicts (Please review the attachment to review more details) when I apply it to the master branch. My test based on this commit: commit 1e15510b71c99c6e49134d756df91069f7d18141 (origin/master, origin/HEAD) Merge: f09d694cf799 54e1b4becf5e Author: Linus Torvalds torvalds@linux-foundation.org Date: Thu Feb 27 09:32:42 2025 -0800
Merge tag 'net-6.14-rc5' of
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski: "Including fixes from bluetooth.
Hi,
thank you for including it in your tests.
The mentioned commit is not yet in bpf-next, but I rebased my patch on latest mainline and attached the updated patch for the conflicting commit (now with proper formatting, hopefully).
The conflict was just about a section that was removed right below my added line in network_helpers.h, so no functional change.
---
From c7182e5a4d21696b9e8cd25f92e64e28129e2c6e Mon Sep 17 00:00:00 2001 From: Marcus Wichelmann marcus.wichelmann@hetzner-cloud.de Date: Thu, 13 Feb 2025 16:28:19 +0000 Subject: [PATCH] selftests/bpf: move open_tuntap to network helpers
To test the XDP metadata functionality of the tun driver, it's necessary to create a new tap device first. A helper function for this already exists in lwt_helpers.h. Move it to the common network helpers header, so it can be reused in other tests.
Signed-off-by: Marcus Wichelmann marcus.wichelmann@hetzner-cloud.de Reviewed-by: Willem de Bruijn willemb@google.com Acked-by: Jason Wang jasowang@redhat.com --- tools/testing/selftests/bpf/network_helpers.c | 28 ++++++++++++++++++ tools/testing/selftests/bpf/network_helpers.h | 3 ++ .../selftests/bpf/prog_tests/lwt_helpers.h | 29 ------------------- 3 files changed, 31 insertions(+), 29 deletions(-)
diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c index 80844a5fb1fe..fcee2c4a637a 100644 --- a/tools/testing/selftests/bpf/network_helpers.c +++ b/tools/testing/selftests/bpf/network_helpers.c @@ -548,6 +548,34 @@ void close_netns(struct nstoken *token) free(token); }
+int open_tuntap(const char *dev_name, bool need_mac) +{ + int err = 0; + struct ifreq ifr; + int fd = open("/dev/net/tun", O_RDWR); + + if (!ASSERT_GT(fd, 0, "open(/dev/net/tun)")) + return -1; + + ifr.ifr_flags = IFF_NO_PI | (need_mac ? IFF_TAP : IFF_TUN); + strncpy(ifr.ifr_name, dev_name, IFNAMSIZ - 1); + ifr.ifr_name[IFNAMSIZ - 1] = '\0'; + + err = ioctl(fd, TUNSETIFF, &ifr); + if (!ASSERT_OK(err, "ioctl(TUNSETIFF)")) { + close(fd); + return -1; + } + + err = fcntl(fd, F_SETFL, O_NONBLOCK); + if (!ASSERT_OK(err, "fcntl(O_NONBLOCK)")) { + close(fd); + return -1; + } + + return fd; +} + int get_socket_local_port(int sock_fd) { struct sockaddr_storage addr; diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h index ebec8a8d6f81..175dd547849f 100644 --- a/tools/testing/selftests/bpf/network_helpers.h +++ b/tools/testing/selftests/bpf/network_helpers.h @@ -8,6 +8,7 @@ typedef __u16 __sum16; #include <linux/if_ether.h> #include <linux/if_packet.h> +#include <linux/if_tun.h> #include <linux/ip.h> #include <linux/ipv6.h> #include <linux/ethtool.h> @@ -98,6 +99,8 @@ int send_recv_data(int lfd, int fd, uint32_t total_bytes); int make_netns(const char *name); int remove_netns(const char *name);
+int open_tuntap(const char *dev_name, bool need_mac); + static __u16 csum_fold(__u32 csum) { csum = (csum & 0xffff) + (csum >> 16); diff --git a/tools/testing/selftests/bpf/prog_tests/lwt_helpers.h b/tools/testing/selftests/bpf/prog_tests/lwt_helpers.h index fb1eb8c67361..ccec0fcdabc1 100644 --- a/tools/testing/selftests/bpf/prog_tests/lwt_helpers.h +++ b/tools/testing/selftests/bpf/prog_tests/lwt_helpers.h @@ -5,7 +5,6 @@
#include <time.h> #include <net/if.h> -#include <linux/if_tun.h> #include <linux/icmp.h>
#include "test_progs.h" @@ -37,34 +36,6 @@ static inline int netns_delete(void) return system("ip netns del " NETNS ">/dev/null 2>&1"); }
-static int open_tuntap(const char *dev_name, bool need_mac) -{ - int err = 0; - struct ifreq ifr; - int fd = open("/dev/net/tun", O_RDWR); - - if (!ASSERT_GT(fd, 0, "open(/dev/net/tun)")) - return -1; - - ifr.ifr_flags = IFF_NO_PI | (need_mac ? IFF_TAP : IFF_TUN); - strncpy(ifr.ifr_name, dev_name, IFNAMSIZ - 1); - ifr.ifr_name[IFNAMSIZ - 1] = '\0'; - - err = ioctl(fd, TUNSETIFF, &ifr); - if (!ASSERT_OK(err, "ioctl(TUNSETIFF)")) { - close(fd); - return -1; - } - - err = fcntl(fd, F_SETFL, O_NONBLOCK); - if (!ASSERT_OK(err, "fcntl(O_NONBLOCK)")) { - close(fd); - return -1; - } - - return fd; -} - #define ICMP_PAYLOAD_SIZE 100
/* Match an ICMP packet with payload len ICMP_PAYLOAD_SIZE */
linux-kselftest-mirror@lists.linaro.org