This patch set improves the documentation and selftests for XDP Rx metadata handling. The first patch clarifies the documentation around XDP metadata layout and the use of bpf_xdp_adjust_meta. The second patch enhances the BPF selftests to make XDP metadata handling more robust and portable across different NICs.
Prior to this patch set, the user application retrieved the xdp_meta by calculating backward from the data pointer, while the XDP program filled in the xdp_meta by calculating backward from data_meta. The two locations differ whenever the device reserves metadata between data_meta and data.
                        |<---sizeof(xdp_meta)--|
                        |                      |
                 struct xdp_meta        rx_desc->address
                        ^                      ^
                        |                      |
+----------+----------------------+------------+------+
| headroom |   custom metadata    |  reserved  | data |
+----------+----------------------+------------+------+
           ^                      ^            ^
           |                      |            |
    struct xdp_meta      xdp_buff->data_meta xdp_buff->data
           |                      |
           |<---sizeof(xdp_meta)--|
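The mismatch described above can be checked with plain offset arithmetic. The following is a userspace sketch only, not code from the series: `struct xdp_meta_model` is a hypothetical stand-in for the selftests' `struct xdp_meta` (only its size matters here), and all function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct xdp_meta; 16 bytes on common ABIs. */
struct xdp_meta_model {
	uint64_t rx_timestamp;
	uint32_t rx_hash;
	uint32_t hint_valid;
};

/* Where the XDP program writes: sizeof(struct) below the old data_meta. */
size_t producer_meta_offset(size_t data_meta_off)
{
	return data_meta_off - sizeof(struct xdp_meta_model);
}

/* Where the AF_XDP consumer reads: sizeof(struct) below rx_desc->address,
 * which points at the start of the packet data.
 */
size_t consumer_meta_offset(size_t data_off)
{
	return data_off - sizeof(struct xdp_meta_model);
}

/* Distance in bytes between the two locations for a buffer whose packet
 * data starts at data_off and whose driver reserved `reserved` bytes
 * between data_meta and data.
 */
size_t meta_mismatch(size_t data_off, size_t reserved)
{
	size_t data_meta_off;

	assert(reserved <= data_off);
	data_meta_off = data_off - reserved;
	assert(data_meta_off >= sizeof(struct xdp_meta_model));

	return consumer_meta_offset(data_off) -
	       producer_meta_offset(data_meta_off);
}
```

With no reserved bytes the two sides agree; with any reserved bytes the consumer reads that many bytes past the location the program wrote to.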
Song Yoong Siang (2):
  doc: clarify XDP Rx metadata layout and bpf_xdp_adjust_meta usage
  selftests/bpf: Enhance XDP Rx Metadata Handling

 Documentation/networking/xdp-rx-metadata.rst  | 38 +++++++++++++++++++
 .../selftests/bpf/prog_tests/xdp_metadata.c   |  2 +-
 .../selftests/bpf/progs/xdp_hw_metadata.c     | 10 ++++-
 .../selftests/bpf/progs/xdp_metadata.c        |  8 +++-
 tools/testing/selftests/bpf/xdp_hw_metadata.c |  2 +-
 tools/testing/selftests/bpf/xdp_metadata.h    |  7 ++++
 6 files changed, 63 insertions(+), 4 deletions(-)
Expand the explanation of how METADATA_SIZE should be chosen to accommodate both device-reserved and custom metadata. Additionally, add a diagram to illustrate the calculation of the delta parameter for bpf_xdp_adjust_meta, including alignment and size constraints.
These changes help users correctly allocate and access metadata in AF_XDP use cases.
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
---
 Documentation/networking/xdp-rx-metadata.rst | 38 ++++++++++++++++++++
 1 file changed, 38 insertions(+)
diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
index a6e0ece18be5..61418f533e0e 100644
--- a/Documentation/networking/xdp-rx-metadata.rst
+++ b/Documentation/networking/xdp-rx-metadata.rst
@@ -54,6 +54,19 @@ area in whichever format it chooses. Later consumers of the metadata
 will have to agree on the format by some out of band contract (like for
 the AF_XDP use case, see below).
 
+It is important to note that some devices may utilize the ``data_meta`` area for
+their own purposes. For example, the IGC device utilizes ``IGC_TS_HDR_LEN``
+bytes of the ``data_meta`` area for receiving hardware timestamps. Therefore,
+the XDP program should ensure that it does not overwrite any existing metadata.
+The metadata layout of such device is depicted below::
+
+  +----------+-----------------+--------------------------+------+
+  | headroom | custom metadata | device-reserved metadata | data |
+  +----------+-----------------+--------------------------+------+
+                               ^                          ^
+                               |                          |
+                      xdp_buff->data_meta          xdp_buff->data
+
 AF_XDP
 ======
 
@@ -76,6 +89,31 @@ Here is the ``AF_XDP`` consumer layout (note missing ``data_meta`` pointer)::
                                |
                         rx_desc->address
 
+It is crucial that the agreed ``METADATA_SIZE`` between the BPF program and the
+final consumer is sufficient to accommodate both device-reserved metadata and
+the data the BPF program needs to populate. When calling
+``bpf_xdp_adjust_meta``, the input parameter ``delta`` should be calculated as
+``METADATA_SIZE - (xdp_buff->data - xdp_buff->data_meta)``.
+
+The diagram below provides a visual representation of the calculation of
+``delta`` and the overall metadata layout::
+
+             |<-------------------METADATA_SIZE------------------->|
+  +----------+--------------------------+--------------------------+------+
+  | headroom |     custom metadata      | device-reserved metadata | data |
+  +----------+--------------------------+--------------------------+------+
+             ^                          ^                          ^
+             |                          |                          |
+  new xdp_buff->data_meta    old xdp_buff->data_meta        xdp_buff->data
+             |                          |
+             |<----------delta--------->|
+
+``bpf_xdp_adjust_meta`` ensures that ``METADATA_SIZE`` is aligned to 4 bytes,
+does not exceed 252 bytes, and leaves sufficient space for building the
+xdp_frame. If these conditions are not met, it returns a negative error. In this
+case, the BPF program should not proceed to populate data into the ``data_meta``
+area.
+
 XDP_PASS
 ========
 
Introduce the XDP_METADATA_SIZE macro to ensure that user applications can consistently retrieve the correct location of struct xdp_meta.
Prior to this commit, the XDP program adjusted the data_meta backward by the size of struct xdp_meta, while the user application retrieved the data by calculating backward from the data pointer. This approach only worked if xdp_buff->data_meta was equal to xdp_buff->data before calling bpf_xdp_adjust_meta.
With the introduction of XDP_METADATA_SIZE, both the XDP program and user application now calculate and identify the location of struct xdp_meta from the data pointer. This ensures the implementation remains functional even when there is device-reserved metadata, making the tests more portable across different NICs.
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
---
 tools/testing/selftests/bpf/prog_tests/xdp_metadata.c |  2 +-
 tools/testing/selftests/bpf/progs/xdp_hw_metadata.c   | 10 +++++++++-
 tools/testing/selftests/bpf/progs/xdp_metadata.c      |  8 +++++++-
 tools/testing/selftests/bpf/xdp_hw_metadata.c         |  2 +-
 tools/testing/selftests/bpf/xdp_metadata.h            |  7 +++++++
 5 files changed, 25 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
index 19f92affc2da..8d6c2633698b 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_metadata.c
@@ -302,7 +302,7 @@ static int verify_xsk_metadata(struct xsk *xsk, bool sent_from_af_xdp)
 
 	/* custom metadata */
 
-	meta = data - sizeof(struct xdp_meta);
+	meta = data - XDP_METADATA_SIZE;
 
 	if (!ASSERT_NEQ(meta->rx_timestamp, 0, "rx_timestamp"))
 		return -1;
diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
index 330ece2eabdb..72242ac1cdcd 100644
--- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c
@@ -27,6 +27,7 @@ extern int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
 SEC("xdp.frags")
 int rx(struct xdp_md *ctx)
 {
+	int metalen_used, metalen_to_adjust;
 	void *data, *data_meta, *data_end;
 	struct ipv6hdr *ip6h = NULL;
 	struct udphdr *udp = NULL;
@@ -72,7 +73,14 @@ int rx(struct xdp_md *ctx)
 		return XDP_PASS;
 	}
 
-	err = bpf_xdp_adjust_meta(ctx, -(int)sizeof(struct xdp_meta));
+	metalen_used = ctx->data - ctx->data_meta;
+	metalen_to_adjust = XDP_METADATA_SIZE - metalen_used;
+	if (metalen_to_adjust < (int)sizeof(struct xdp_meta)) {
+		__sync_add_and_fetch(&pkts_skip, 1);
+		return XDP_PASS;
+	}
+
+	err = bpf_xdp_adjust_meta(ctx, -metalen_to_adjust);
 	if (err) {
 		__sync_add_and_fetch(&pkts_fail, 1);
 		return XDP_PASS;
diff --git a/tools/testing/selftests/bpf/progs/xdp_metadata.c b/tools/testing/selftests/bpf/progs/xdp_metadata.c
index 09bb8a038d52..a0ba4ef4bbd8 100644
--- a/tools/testing/selftests/bpf/progs/xdp_metadata.c
+++ b/tools/testing/selftests/bpf/progs/xdp_metadata.c
@@ -37,6 +37,7 @@ extern int bpf_xdp_metadata_rx_vlan_tag(const struct xdp_md *ctx,
 SEC("xdp")
 int rx(struct xdp_md *ctx)
 {
+	int metalen_used, metalen_to_adjust;
 	void *data, *data_meta, *data_end;
 	struct ipv6hdr *ip6h = NULL;
 	struct ethhdr *eth = NULL;
@@ -73,7 +74,12 @@ int rx(struct xdp_md *ctx)
 
 	/* Reserve enough for all custom metadata. */
 
-	ret = bpf_xdp_adjust_meta(ctx, -(int)sizeof(struct xdp_meta));
+	metalen_used = ctx->data - ctx->data_meta;
+	metalen_to_adjust = XDP_METADATA_SIZE - metalen_used;
+	if (metalen_to_adjust < (int)sizeof(struct xdp_meta))
+		return XDP_DROP;
+
+	ret = bpf_xdp_adjust_meta(ctx, -metalen_to_adjust);
 	if (ret != 0)
 		return XDP_DROP;
 
diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c
index 3d8de0d4c96a..a529d55d4ff4 100644
--- a/tools/testing/selftests/bpf/xdp_hw_metadata.c
+++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c
@@ -223,7 +223,7 @@ static void verify_xdp_metadata(void *data, clockid_t clock_id)
 {
 	struct xdp_meta *meta;
 
-	meta = data - sizeof(*meta);
+	meta = data - XDP_METADATA_SIZE;
 
 	if (meta->hint_valid & XDP_META_FIELD_RSS)
 		printf("rx_hash: 0x%X with RSS type:0x%X\n",
diff --git a/tools/testing/selftests/bpf/xdp_metadata.h b/tools/testing/selftests/bpf/xdp_metadata.h
index 87318ad1117a..2dfd3bf5e7bb 100644
--- a/tools/testing/selftests/bpf/xdp_metadata.h
+++ b/tools/testing/selftests/bpf/xdp_metadata.h
@@ -50,3 +50,10 @@ struct xdp_meta {
 	};
 	enum xdp_meta_field hint_valid;
 };
+
+/* XDP_METADATA_SIZE must be at least the size of struct xdp_meta. An additional
+ * 32 bytes of padding is included as a conservative measure to accommodate any
+ * metadata areas reserved by Ethernet devices. If the device-reserved metadata
+ * exceeds 32 bytes, this value will need adjustment.
+ */
+#define XDP_METADATA_SIZE	(sizeof(struct xdp_meta) + 32)
On 07/01, Song Yoong Siang wrote:
> +	metalen_used = ctx->data - ctx->data_meta;

Is the intent here to query how much metadata has been consumed/reserved
by the driver? Looking at IGC it has the following code/comment:

	bi->xdp->data += IGC_TS_HDR_LEN;

	/* HW timestamp has been copied into local variable. Metadata
	 * length when XDP program is called should be 0.
	 */
	bi->xdp->data_meta += IGC_TS_HDR_LEN;

Are you sure that metadata size is correctly exposed to the bpf program?

My assumption was that we should just unconditionally do bpf_xdp_adjust_meta
with -XDP_METADATA_SIZE and that should be good enough.
On Wednesday, July 2, 2025 12:31 AM, Stanislav Fomichev stfomichev@gmail.com wrote:
> > +	metalen_used = ctx->data - ctx->data_meta;
>
> Is the intent here to query how much metadata has been consumed/reserved
> by the driver?

Yes.

> Looking at IGC it has the following code/comment:
>
> 	bi->xdp->data += IGC_TS_HDR_LEN;
>
> 	/* HW timestamp has been copied into local variable. Metadata
> 	 * length when XDP program is called should be 0.
> 	 */
> 	bi->xdp->data_meta += IGC_TS_HDR_LEN;
>
> Are you sure that metadata size is correctly exposed to the bpf program?

You are right, the current igc driver does not expose the metadata size
correctly. I submitted [1] to fix it.

[1] https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20250701080955.32...

> My assumption was that we should just unconditionally do bpf_xdp_adjust_meta
> with -XDP_METADATA_SIZE and that should be good enough.

The check is just a precaution. There is no problem with adjusting the meta
unconditionally, and doing so also saves processing time for each packet.
I will remove the check and submit v2.

Thanks & Regards
Siang
On Wednesday, July 2, 2025 10:23 AM, Song, Yoong Siang yoong.siang.song@intel.com wrote:
On Wednesday, July 2, 2025 12:31 AM, Stanislav Fomichev stfomichev@gmail.com wrote:
>> Is the intent here to query how much metadata has been consumed/reserved
>> by the driver?
>
> Yes.
>
>> Looking at IGC it has the following code/comment:
>>
>> 	bi->xdp->data += IGC_TS_HDR_LEN;
>>
>> 	/* HW timestamp has been copied into local variable. Metadata
>> 	 * length when XDP program is called should be 0.
>> 	 */
>> 	bi->xdp->data_meta += IGC_TS_HDR_LEN;
>>
>> Are you sure that metadata size is correctly exposed to the bpf program?
>
> You are right, the current igc driver does not expose the metadata size
> correctly. I submitted [1] to fix it.
>
> [1] https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20250701080955.3273137-1-yoong.siang.song@intel.com/
>
>> My assumption was that we should just unconditionally do bpf_xdp_adjust_meta
>> with -XDP_METADATA_SIZE and that should be good enough.
>
> The check is just a precaution. There is no problem with adjusting the meta
> unconditionally, and doing so also saves processing time for each packet.
> I will remove the check and submit v2.
>
> Thanks & Regards
> Siang
Hi Stanislav Fomichev,

I submitted v2, but after that I had second thoughts. IMHO,

	err = bpf_xdp_adjust_meta(ctx, (int)(ctx->data - ctx->data_meta - XDP_METADATA_SIZE));

is better than

	err = bpf_xdp_adjust_meta(ctx, -(int)XDP_METADATA_SIZE);

because it is more robust.
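The difference between the two calls can be illustrated with a small userspace model. This is a sketch only: the function names are hypothetical, and the model assumes nothing beyond the helper's basic behaviour that a negative delta grows the metadata area by that many bytes.

```c
#include <assert.h>

/* Metadata length (data - data_meta) after adjusting data_meta by delta.
 * A negative delta moves data_meta toward the headroom.
 */
int metalen_after_adjust(int metalen_before, int delta)
{
	assert(metalen_before >= 0);
	return metalen_before - delta;
}

/* Unconditional form: always grow by metadata_size. */
int delta_fixed(int metadata_size)
{
	return -metadata_size;
}

/* Relative form: grow only by what is missing to reach metadata_size. */
int delta_relative(int metalen_before, int metadata_size)
{
	return metalen_before - metadata_size;
}

/* The program writes at the new data_meta and the consumer reads at
 * data - metadata_size, so the two agree only if the resulting metadata
 * length equals metadata_size.
 */
int meta_location_matches(int metalen_before, int metadata_size, int delta)
{
	return metalen_after_adjust(metalen_before, delta) == metadata_size;
}
```

When the driver presents a metadata length of 0 to the program, both forms land on the same location. They diverge only when the driver exposes a non-zero length on entry, in which case the fixed form overshoots by that length.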
Any thoughts?
Thanks & Regards Siang
linux-kselftest-mirror@lists.linaro.org