This series expands XDP TX metadata framework to include ETF HW offload.
Changes since v1: - rename Time-Based Scheduling (TBS) to Earliest TxTime First (ETF) - rename launch-time to txtime
v1: https://patchwork.kernel.org/project/netdevbpf/cover/20231130162028.852006-1...
Song Yoong Siang (3): xsk: add ETF support to XDP Tx metadata net: stmmac: Add txtime support to XDP ZC selftests/bpf: Add txtime to xdp_hw_metadata
Documentation/netlink/specs/netdev.yaml | 4 ++++ Documentation/networking/xsk-tx-metadata.rst | 5 +++++ drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++ .../net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++ include/net/xdp_sock.h | 9 +++++++++ include/net/xdp_sock_drv.h | 1 + include/uapi/linux/if_xdp.h | 9 +++++++++ include/uapi/linux/netdev.h | 3 +++ net/core/netdev-genl.c | 2 ++ net/xdp/xsk.c | 3 +++ tools/include/uapi/linux/if_xdp.h | 9 +++++++++ tools/include/uapi/linux/netdev.h | 3 +++ tools/net/ynl/generated/netdev-user.c | 1 + tools/testing/selftests/bpf/xdp_hw_metadata.c | 18 +++++++++++++++++- 14 files changed, 81 insertions(+), 1 deletion(-)
This patch extends the XDP Tx metadata framework to include Earliest TxTime First (ETF) HW offload support where the NIC will schedule the packet for transmission at a pre-determined time called txtime. The value of txtime is communicated from user space to Ethernet driver via txtime field of struct xsk_tx_metadata.
Suggested-by: Stanislav Fomichev sdf@google.com Signed-off-by: Song Yoong Siang yoong.siang.song@intel.com --- Documentation/netlink/specs/netdev.yaml | 4 ++++ Documentation/networking/xsk-tx-metadata.rst | 5 +++++ include/net/xdp_sock.h | 9 +++++++++ include/net/xdp_sock_drv.h | 1 + include/uapi/linux/if_xdp.h | 9 +++++++++ include/uapi/linux/netdev.h | 3 +++ net/core/netdev-genl.c | 2 ++ net/xdp/xsk.c | 3 +++ tools/include/uapi/linux/if_xdp.h | 9 +++++++++ tools/include/uapi/linux/netdev.h | 3 +++ tools/net/ynl/generated/netdev-user.c | 1 + 11 files changed, 49 insertions(+)
diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml index 00439bcbd2e3..339cdcddbfc0 100644 --- a/Documentation/netlink/specs/netdev.yaml +++ b/Documentation/netlink/specs/netdev.yaml @@ -66,6 +66,10 @@ definitions: name: tx-checksum doc: L3 checksum HW offload is supported by the driver. + - + name: txtime + doc: + Earliest TxTime First (ETF) HW offload is supported by the driver.
attribute-sets: - diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst index 97ecfa480d00..654ff692062a 100644 --- a/Documentation/networking/xsk-tx-metadata.rst +++ b/Documentation/networking/xsk-tx-metadata.rst @@ -44,6 +44,10 @@ The flags field enables the particular offload: checksum. ``csum_start`` specifies byte offset of where the checksumming should start and ``csum_offset`` specifies byte offset where the device should store the computed checksum. +- ``XDP_TXMD_FLAGS_TXTIME``: requests Earliest TxTime First (ETF) HW + offload to launch the packet at a pre-determined time. ``txtime`` + indicates the time which the NIC should schedule the packet for + transmission.
Besides the flags above, in order to trigger the offloads, the first packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA`` @@ -68,6 +72,7 @@ Refer to ``xsk-flags`` features bitmask in
- ``tx-timestamp``: device supports ``XDP_TXMD_FLAGS_TIMESTAMP`` - ``tx-checksum``: device supports ``XDP_TXMD_FLAGS_CHECKSUM`` +- ``txtime``: device supports ``XDP_TXMD_FLAGS_TXTIME``
See ``tools/net/ynl/samples/netdev.c`` on how to query this information.
diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index 3cb4dc9bd70e..0651b5264b72 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -110,11 +110,16 @@ struct xdp_sock { * indicates position where checksumming should start. * csum_offset indicates position where checksum should be stored. * + * void (*tmo_request_txtime)(u64 txtime, void *priv) + * Called when AF_XDP frame requested Earliest TxTime First (ETF) HW offload + * support. txtime indicates the time which the NIC should schedule the + * packet for transmission. */ struct xsk_tx_metadata_ops { void (*tmo_request_timestamp)(void *priv); u64 (*tmo_fill_timestamp)(void *priv); void (*tmo_request_checksum)(u16 csum_start, u16 csum_offset, void *priv); + void (*tmo_request_txtime)(u64 txtime, void *priv); };
#ifdef CONFIG_XDP_SOCKETS @@ -170,6 +175,10 @@ static inline void xsk_tx_metadata_request(const struct xsk_tx_metadata *meta, if (meta->flags & XDP_TXMD_FLAGS_CHECKSUM) ops->tmo_request_checksum(meta->request.csum_start, meta->request.csum_offset, priv); + + if (ops->tmo_request_txtime) + if (meta->flags & XDP_TXMD_FLAGS_TXTIME) + ops->tmo_request_txtime(meta->request.txtime, priv); }
/** diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h index 81e02de3f453..e66d597e7079 100644 --- a/include/net/xdp_sock_drv.h +++ b/include/net/xdp_sock_drv.h @@ -168,6 +168,7 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr) #define XDP_TXMD_FLAGS_VALID ( \ XDP_TXMD_FLAGS_TIMESTAMP | \ XDP_TXMD_FLAGS_CHECKSUM | \ + XDP_TXMD_FLAGS_TXTIME | \ 0)
static inline bool xsk_buff_valid_tx_metadata(struct xsk_tx_metadata *meta) diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h index d31698410410..24d123bce7f3 100644 --- a/include/uapi/linux/if_xdp.h +++ b/include/uapi/linux/if_xdp.h @@ -123,6 +123,12 @@ struct xdp_options { */ #define XDP_TXMD_FLAGS_CHECKSUM (1 << 1)
+/* Request Earliest TxTime First (ETF) HW offload to launch the packet at a + * pre-determined time. The time which the NIC should schedule the packet for + * transmission is communicated via txtime field of struct xsk_tx_metadata. + */ +#define XDP_TXMD_FLAGS_TXTIME (1 << 2) + /* AF_XDP offloads request. 'request' union member is consumed by the driver * when the packet is being transmitted. 'completion' union member is * filled by the driver when the transmit completion arrives. @@ -138,6 +144,9 @@ struct xsk_tx_metadata { __u16 csum_start; /* Offset from csum_start where checksum should be stored. */ __u16 csum_offset; + + /* XDP_TXMD_FLAGS_TXTIME */ + __u64 txtime; } request;
struct { diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h index 48d5477a668c..03b913757e1c 100644 --- a/include/uapi/linux/netdev.h +++ b/include/uapi/linux/netdev.h @@ -59,10 +59,13 @@ enum netdev_xdp_rx_metadata { * by the driver. * @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the * driver. + * @NETDEV_XSK_FLAGS_TXTIME: Earliest TxTime First (ETF) HW offload is supported + * by the driver. */ enum netdev_xsk_flags { NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1, NETDEV_XSK_FLAGS_TX_CHECKSUM = 2, + NETDEV_XSK_FLAGS_TXTIME = 3,
/* private: */ NETDEV_XSK_FLAGS_MASK = 3, diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c index 10f2124e9e23..aecc6e26f839 100644 --- a/net/core/netdev-genl.c +++ b/net/core/netdev-genl.c @@ -33,6 +33,8 @@ XDP_METADATA_KFUNC_xxx xsk_features |= NETDEV_XSK_FLAGS_TX_TIMESTAMP; if (netdev->xsk_tx_metadata_ops->tmo_request_checksum) xsk_features |= NETDEV_XSK_FLAGS_TX_CHECKSUM; + if (netdev->xsk_tx_metadata_ops->tmo_request_txtime) + xsk_features |= NETDEV_XSK_FLAGS_TXTIME; }
if (nla_put_u32(rsp, NETDEV_A_DEV_IFINDEX, netdev->ifindex) || diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index 281d49b4fca4..4d48fc6caf7c 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -751,6 +751,9 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs, goto free_err; } } + + if (meta->flags & XDP_TXMD_FLAGS_TXTIME) + skb->skb_mstamp_ns = meta->request.txtime; } }
diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h index 638c606dfa74..abcf79e8a879 100644 --- a/tools/include/uapi/linux/if_xdp.h +++ b/tools/include/uapi/linux/if_xdp.h @@ -123,6 +123,12 @@ struct xdp_options { */ #define XDP_TXMD_FLAGS_CHECKSUM (1 << 1)
+/* Request Earliest TxTime First (ETF) HW offload to launch the packet at a + * pre-determined time. The time which the NIC should schedule the packet for + * transmission is communicated via txtime field of struct xsk_tx_metadata. + */ +#define XDP_TXMD_FLAGS_TXTIME (1 << 2) + /* AF_XDP offloads request. 'request' union member is consumed by the driver * when the packet is being transmitted. 'completion' union member is * filled by the driver when the transmit completion arrives. @@ -138,6 +144,9 @@ struct xsk_tx_metadata { __u16 csum_start; /* Offset from csum_start where checksum should be stored. */ __u16 csum_offset; + + /* XDP_TXMD_FLAGS_TXTIME */ + __u64 txtime; } request;
struct { diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h index 48d5477a668c..03b913757e1c 100644 --- a/tools/include/uapi/linux/netdev.h +++ b/tools/include/uapi/linux/netdev.h @@ -59,10 +59,13 @@ enum netdev_xdp_rx_metadata { * by the driver. * @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the * driver. + * @NETDEV_XSK_FLAGS_TXTIME: Earliest TxTime First (ETF) HW offload is supported + * by the driver. */ enum netdev_xsk_flags { NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1, NETDEV_XSK_FLAGS_TX_CHECKSUM = 2, + NETDEV_XSK_FLAGS_TXTIME = 3,
/* private: */ NETDEV_XSK_FLAGS_MASK = 3, diff --git a/tools/net/ynl/generated/netdev-user.c b/tools/net/ynl/generated/netdev-user.c index 6283d87dad37..02f1bd4cd97b 100644 --- a/tools/net/ynl/generated/netdev-user.c +++ b/tools/net/ynl/generated/netdev-user.c @@ -61,6 +61,7 @@ const char *netdev_xdp_rx_metadata_str(enum netdev_xdp_rx_metadata value) static const char * const netdev_xsk_flags_strmap[] = { [0] = "tx-timestamp", [1] = "tx-checksum", + [2] = "txtime" };
const char *netdev_xsk_flags_str(enum netdev_xsk_flags value)
This patch enables txtime support to XDP zero copy via XDP Tx metadata framework.
Signed-off-by: Song Yoong Siang yoong.siang.song@intel.com --- drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++ drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++ 2 files changed, 15 insertions(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h index 686c94c2e8a7..e8538af6e207 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h @@ -105,6 +105,8 @@ struct stmmac_metadata_request { struct stmmac_priv *priv; struct dma_desc *tx_desc; bool *set_ic; + struct dma_edesc *edesc; + int tbs; };
struct stmmac_xsk_tx_complete { diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index c2ac88aaffed..c7b9338be9e0 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -2465,9 +2465,20 @@ static u64 stmmac_xsk_fill_timestamp(void *_priv) return 0; }
+static void stmmac_xsk_request_txtime(u64 txtime, void *_priv) +{ + struct stmmac_metadata_request *meta_req = _priv; + struct timespec64 ts = ns_to_timespec64(txtime); + + if (meta_req->tbs & STMMAC_TBS_EN) + stmmac_set_desc_tbs(meta_req->priv, meta_req->edesc, ts.tv_sec, + ts.tv_nsec); +} + static const struct xsk_tx_metadata_ops stmmac_xsk_tx_metadata_ops = { .tmo_request_timestamp = stmmac_xsk_request_timestamp, .tmo_fill_timestamp = stmmac_xsk_fill_timestamp, + .tmo_request_txtime = stmmac_xsk_request_txtime, };
static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) @@ -2545,6 +2556,8 @@ static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) meta_req.priv = priv; meta_req.tx_desc = tx_desc; meta_req.set_ic = &set_ic; + meta_req.tbs = tx_q->tbs; + meta_req.edesc = &tx_q->dma_entx[entry]; xsk_tx_metadata_request(meta, &stmmac_xsk_tx_metadata_ops, &meta_req); if (set_ic) {
On 12/1/23 07:24, Song Yoong Siang wrote:
This patch enables txtime support to XDP zero copy via XDP Tx metadata framework.
Signed-off-by: Song Yoong Siangyoong.siang.song@intel.com
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++ drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++ 2 files changed, 15 insertions(+)
I think we need to see other drivers using this new feature to evaluate if API is sane.
I suggest implementing this for igc driver (chip i225) and also for igb (i210 chip) that both support this kind of LaunchTime feature in HW.
The API and stmmac driver takes a u64 as time. I'm wondering how this applies to i210 that[1] have 25-bit for LaunchTime (with 32 nanosec granularity) limiting LaunchTime max 0.5 second into the future. And i225 that [1] have 30-bit max 1 second into the future.
[1] https://github.com/xdp-project/xdp-project/blob/master/areas/tsn/code01_foll...
Jesper Dangaard Brouer wrote:
On 12/1/23 07:24, Song Yoong Siang wrote:
This patch enables txtime support to XDP zero copy via XDP Tx metadata framework.
Signed-off-by: Song Yoong Siangyoong.siang.song@intel.com
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++ drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++ 2 files changed, 15 insertions(+)
I think we need to see other drivers using this new feature to evaluate if API is sane.
I suggest implementing this for igc driver (chip i225) and also for igb (i210 chip) that both support this kind of LaunchTime feature in HW.
The API and stmmac driver takes a u64 as time. I'm wondering how this applies to i210 that[1] have 25-bit for LaunchTime (with 32 nanosec granularity) limiting LaunchTime max 0.5 second into the future. And i225 that [1] have 30-bit max 1 second into the future.
[1] https://github.com/xdp-project/xdp-project/blob/master/areas/tsn/code01_foll...
Good point Jesper.
Can we also explicitly document what the type of the field is? Nanoseconds against the NIC hardware clock, it sounds like.
We have some experience with this, too. Something needs to do the conversion from host clock to NIC clock. It is not sufficent to just assume that the host clock is synced against the NIC clock by PTP.
On Friday, December 1, 2023 11:02 PM, Jesper Dangaard Brouer wrote:
On 12/1/23 07:24, Song Yoong Siang wrote:
This patch enables txtime support to XDP zero copy via XDP Tx metadata framework.
Signed-off-by: Song Yoong Siangyoong.siang.song@intel.com
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++ drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++ 2 files changed, 15 insertions(+)
I think we need to see other drivers using this new feature to evaluate if API is sane.
I suggest implementing this for igc driver (chip i225) and also for igb (i210 chip) that both support this kind of LaunchTime feature in HW.
The API and stmmac driver takes a u64 as time. I'm wondering how this applies to i210 that[1] have 25-bit for LaunchTime (with 32 nanosec granularity) limiting LaunchTime max 0.5 second into the future. And i225 that [1] have 30-bit max 1 second into the future.
[1] https://github.com/xdp-project/xdp- project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org
I am using u64 for launch time because existing EDT framework is using it. Refer to struct sk_buff below. Both u64 and ktime_t can be used as launch time. I choose u64 because ktime_t often requires additional type conversion and we didn't expect negative value of time.
include/linux/skbuff.h-744- * @tstamp: Time we arrived/left include/linux/skbuff.h:745- * @skb_mstamp_ns: (aka @tstamp) earliest departure time; start point include/linux/skbuff.h-746- * for retransmit timer -- include/linux/skbuff.h-880- union { include/linux/skbuff.h-881- ktime_t tstamp; include/linux/skbuff.h:882- u64 skb_mstamp_ns; /* earliest departure time */ include/linux/skbuff.h-883- };
tstamp/skb_mstamp_ns are used by various drivers for launch time support on normal packet, so I think u64 should be "friendly" to all the drivers. For an example, igc driver will take launch time from tstamp and recalculate it accordingly (i225 expect user to program "delta time" instead of "time" into HW register).
drivers/net/ethernet/intel/igc/igc_main.c-1602- txtime = skb->tstamp; drivers/net/ethernet/intel/igc/igc_main.c-1603- skb->tstamp = ktime_set(0, 0); drivers/net/ethernet/intel/igc/igc_main.c:1604- launch_time = igc_tx_launchtime(tx_ring, txtime, &first_flag, &insert_empty);
Do you think this is enough to say the API is sane?
Song, Yoong Siang wrote:
On Friday, December 1, 2023 11:02 PM, Jesper Dangaard Brouer wrote:
On 12/1/23 07:24, Song Yoong Siang wrote:
This patch enables txtime support to XDP zero copy via XDP Tx metadata framework.
Signed-off-by: Song Yoong Siangyoong.siang.song@intel.com
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++ drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++ 2 files changed, 15 insertions(+)
I think we need to see other drivers using this new feature to evaluate if API is sane.
I suggest implementing this for igc driver (chip i225) and also for igb (i210 chip) that both support this kind of LaunchTime feature in HW.
The API and stmmac driver takes a u64 as time. I'm wondering how this applies to i210 that[1] have 25-bit for LaunchTime (with 32 nanosec granularity) limiting LaunchTime max 0.5 second into the future. And i225 that [1] have 30-bit max 1 second into the future.
[1] https://github.com/xdp-project/xdp- project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org
I am using u64 for launch time because existing EDT framework is using it. Refer to struct sk_buff below. Both u64 and ktime_t can be used as launch time. I choose u64 because ktime_t often requires additional type conversion and we didn't expect negative value of time.
include/linux/skbuff.h-744- * @tstamp: Time we arrived/left include/linux/skbuff.h:745- * @skb_mstamp_ns: (aka @tstamp) earliest departure time; start point include/linux/skbuff.h-746- * for retransmit timer -- include/linux/skbuff.h-880- union { include/linux/skbuff.h-881- ktime_t tstamp; include/linux/skbuff.h:882- u64 skb_mstamp_ns; /* earliest departure time */ include/linux/skbuff.h-883- };
tstamp/skb_mstamp_ns are used by various drivers for launch time support on normal packet, so I think u64 should be "friendly" to all the drivers. For an example, igc driver will take launch time from tstamp and recalculate it accordingly (i225 expect user to program "delta time" instead of "time" into HW register).
drivers/net/ethernet/intel/igc/igc_main.c-1602- txtime = skb->tstamp; drivers/net/ethernet/intel/igc/igc_main.c-1603- skb->tstamp = ktime_set(0, 0); drivers/net/ethernet/intel/igc/igc_main.c:1604- launch_time = igc_tx_launchtime(tx_ring, txtime, &first_flag, &insert_empty);
Do you think this is enough to say the API is sane?
u64 nsec sounds sane to be. It must be made explicit with clock source it is against.
Some applications could want to do the conversion from a clock source to raw NIC cycle counter in userspace or BPF and program the raw value. So it may be worthwhile to add an clock source argument -- even if initially only CLOCK_MONOTONIC is supported.
See tools/testing/selftests/net/so_txtime.sh for how the FQ and ETF qdiscs already disagree on the clock source that they use.
On Monday, December 4, 2023 10:58 PM, Willem de Bruijn wrote:
Song, Yoong Siang wrote:
On Friday, December 1, 2023 11:02 PM, Jesper Dangaard Brouer wrote:
On 12/1/23 07:24, Song Yoong Siang wrote:
This patch enables txtime support to XDP zero copy via XDP Tx metadata framework.
Signed-off-by: Song Yoong Siangyoong.siang.song@intel.com
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++ drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++ 2 files changed, 15 insertions(+)
I think we need to see other drivers using this new feature to evaluate if API is sane.
I suggest implementing this for igc driver (chip i225) and also for igb (i210 chip) that both support this kind of LaunchTime feature in HW.
The API and stmmac driver takes a u64 as time. I'm wondering how this applies to i210 that[1] have 25-bit for LaunchTime (with 32 nanosec granularity) limiting LaunchTime max 0.5 second into the future. And i225 that [1] have 30-bit max 1 second into the future.
[1] https://github.com/xdp-project/xdp- project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org
I am using u64 for launch time because existing EDT framework is using it. Refer to struct sk_buff below. Both u64 and ktime_t can be used as launch time. I choose u64 because ktime_t often requires additional type conversion and we didn't expect negative value of time.
include/linux/skbuff.h-744- * @tstamp: Time we arrived/left include/linux/skbuff.h:745- * @skb_mstamp_ns: (aka @tstamp) earliest departure
time; start point
include/linux/skbuff.h-746- * for retransmit timer
include/linux/skbuff.h-880- union { include/linux/skbuff.h-881- ktime_t tstamp; include/linux/skbuff.h:882- u64 skb_mstamp_ns; /* earliest departure
time */
include/linux/skbuff.h-883- };
tstamp/skb_mstamp_ns are used by various drivers for launch time support on normal packet, so I think u64 should be "friendly" to all the drivers. For an example, igc driver will take launch time from tstamp and recalculate it accordingly (i225 expect user to program "delta time" instead of "time" into HW register).
drivers/net/ethernet/intel/igc/igc_main.c-1602- txtime = skb->tstamp; drivers/net/ethernet/intel/igc/igc_main.c-1603- skb->tstamp = ktime_set(0, 0); drivers/net/ethernet/intel/igc/igc_main.c:1604- launch_time =
igc_tx_launchtime(tx_ring, txtime, &first_flag, &insert_empty);
Do you think this is enough to say the API is sane?
u64 nsec sounds sane to be. It must be made explicit with clock source it is against.
The u64 launch time should base on NIC PTP hardware clock (PHC). I will add documentation saying which clock source it is against
Some applications could want to do the conversion from a clock source to raw NIC cycle counter in userspace or BPF and program the raw value. So it may be worthwhile to add an clock source argument -- even if initially only CLOCK_MONOTONIC is supported.
Sorry, not so understand your suggestion on adding clock source argument. Are you suggesting to add clock source for the selftest xdp_hw_metadata apps? IMHO, no need to add clock source as the clock source for launch time should always base on NIC PHC.
See tools/testing/selftests/net/so_txtime.sh for how the FQ and ETF qdiscs already disagree on the clock source that they use.
Song, Yoong Siang wrote:
On Monday, December 4, 2023 10:58 PM, Willem de Bruijn wrote:
Song, Yoong Siang wrote:
On Friday, December 1, 2023 11:02 PM, Jesper Dangaard Brouer wrote:
On 12/1/23 07:24, Song Yoong Siang wrote:
This patch enables txtime support to XDP zero copy via XDP Tx metadata framework.
Signed-off-by: Song Yoong Siangyoong.siang.song@intel.com
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++ drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++ 2 files changed, 15 insertions(+)
I think we need to see other drivers using this new feature to evaluate if API is sane.
I suggest implementing this for igc driver (chip i225) and also for igb (i210 chip) that both support this kind of LaunchTime feature in HW.
The API and stmmac driver takes a u64 as time. I'm wondering how this applies to i210 that[1] have 25-bit for LaunchTime (with 32 nanosec granularity) limiting LaunchTime max 0.5 second into the future. And i225 that [1] have 30-bit max 1 second into the future.
[1] https://github.com/xdp-project/xdp- project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org
I am using u64 for launch time because existing EDT framework is using it. Refer to struct sk_buff below. Both u64 and ktime_t can be used as launch time. I choose u64 because ktime_t often requires additional type conversion and we didn't expect negative value of time.
include/linux/skbuff.h-744- * @tstamp: Time we arrived/left include/linux/skbuff.h:745- * @skb_mstamp_ns: (aka @tstamp) earliest departure
time; start point
include/linux/skbuff.h-746- * for retransmit timer
include/linux/skbuff.h-880- union { include/linux/skbuff.h-881- ktime_t tstamp; include/linux/skbuff.h:882- u64 skb_mstamp_ns; /* earliest departure
time */
include/linux/skbuff.h-883- };
tstamp/skb_mstamp_ns are used by various drivers for launch time support on normal packet, so I think u64 should be "friendly" to all the drivers. For an example, igc driver will take launch time from tstamp and recalculate it accordingly (i225 expect user to program "delta time" instead of "time" into HW register).
drivers/net/ethernet/intel/igc/igc_main.c-1602- txtime = skb->tstamp; drivers/net/ethernet/intel/igc/igc_main.c-1603- skb->tstamp = ktime_set(0, 0); drivers/net/ethernet/intel/igc/igc_main.c:1604- launch_time =
igc_tx_launchtime(tx_ring, txtime, &first_flag, &insert_empty);
Do you think this is enough to say the API is sane?
u64 nsec sounds sane to be. It must be made explicit with clock source it is against.
The u64 launch time should base on NIC PTP hardware clock (PHC). I will add documentation saying which clock source it is against
It's not that obvious to me that that is the right and only choice. See below.
Some applications could want to do the conversion from a clock source to raw NIC cycle counter in userspace or BPF and program the raw value. So it may be worthwhile to add an clock source argument -- even if initially only CLOCK_MONOTONIC is supported.
Sorry, not so understand your suggestion on adding clock source argument. Are you suggesting to add clock source for the selftest xdp_hw_metadata apps? IMHO, no need to add clock source as the clock source for launch time should always base on NIC PHC.
This is not how FQ and ETF qdiscs pass timestamps to drivers today.
Those are in CLOCK_MONOTONIC or CLOCK_TAI. The driver is expected to convert from that to its descriptor format, both to the reduced bit width and the NIC PHC.
See also for instance how sch_etf has an explicit q->clock_id match, and SO_TXTIME added an sk_clock_id for the same purpose: to agree on which clock source is being used.
On Tuesday, December 5, 2023 10:55 PM, Willem de Bruijn wrote:
Song, Yoong Siang wrote:
On Monday, December 4, 2023 10:58 PM, Willem de Bruijn wrote:
Song, Yoong Siang wrote:
On Friday, December 1, 2023 11:02 PM, Jesper Dangaard Brouer wrote:
On 12/1/23 07:24, Song Yoong Siang wrote:
This patch enables txtime support to XDP zero copy via XDP Tx metadata framework.
Signed-off-by: Song Yoong Siangyoong.siang.song@intel.com
drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++ drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++ 2 files changed, 15 insertions(+)
I think we need to see other drivers using this new feature to evaluate if API is sane.
I suggest implementing this for igc driver (chip i225) and also for igb (i210 chip) that both support this kind of LaunchTime feature in HW.
The API and stmmac driver takes a u64 as time. I'm wondering how this applies to i210 that[1] have 25-bit for LaunchTime (with 32 nanosec granularity) limiting LaunchTime max 0.5 second into the future. And i225 that [1] have 30-bit max 1 second into the future.
[1] https://github.com/xdp-project/xdp- project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org
I am using u64 for launch time because existing EDT framework is using it. Refer to struct sk_buff below. Both u64 and ktime_t can be used as launch time. I choose u64 because ktime_t often requires additional type conversion and we didn't expect negative value of time.
include/linux/skbuff.h-744- * @tstamp: Time we arrived/left include/linux/skbuff.h:745- * @skb_mstamp_ns: (aka @tstamp) earliest
departure
time; start point
include/linux/skbuff.h-746- * for retransmit timer
include/linux/skbuff.h-880- union { include/linux/skbuff.h-881- ktime_t tstamp; include/linux/skbuff.h:882- u64 skb_mstamp_ns; /* earliest
departure
time */
include/linux/skbuff.h-883- };
tstamp/skb_mstamp_ns are used by various drivers for launch time support on normal packet, so I think u64 should be "friendly" to all the drivers. For an example, igc driver will take launch time from tstamp and recalculate it accordingly (i225 expect user to program "delta time" instead of "time" into HW register).
drivers/net/ethernet/intel/igc/igc_main.c-1602- txtime = skb->tstamp; drivers/net/ethernet/intel/igc/igc_main.c-1603- skb->tstamp = ktime_set(0, 0); drivers/net/ethernet/intel/igc/igc_main.c:1604- launch_time =
igc_tx_launchtime(tx_ring, txtime, &first_flag, &insert_empty);
Do you think this is enough to say the API is sane?
u64 nsec sounds sane to be. It must be made explicit with clock source it is against.
The u64 launch time should base on NIC PTP hardware clock (PHC). I will add documentation saying which clock source it is against
It's not that obvious to me that that is the right and only choice. See below.
Some applications could want to do the conversion from a clock source to raw NIC cycle counter in userspace or BPF and program the raw value. So it may be worthwhile to add an clock source argument -- even if initially only CLOCK_MONOTONIC is supported.
Sorry, not so understand your suggestion on adding clock source argument. Are you suggesting to add clock source for the selftest xdp_hw_metadata apps? IMHO, no need to add clock source as the clock source for launch time should always base on NIC PHC.
This is not how FQ and ETF qdiscs pass timestamps to drivers today.
Those are in CLOCK_MONOTONIC or CLOCK_TAI. The driver is expected to convert from that to its descriptor format, both to the reduced bit width and the NIC PHC.
See also for instance how sch_etf has an explicit q->clock_id match, and SO_TXTIME added an sk_clock_id for the same purpose: to agree on which clock source is being used.
I see. Thank for the explanation. I will try to add clock source arguments In next version.
This patch adds txtime support to xdp_hw_metadata. User can configure the delta of HW txtime to HW RX-time by using "-l" argument. The default delta is set to 1 second.
This patch is tested with stmmac on Intel Tiger Lake platform. Refer to result below, the delta between pre-determined ETF txtime and actual HW transmit complete time is around 24 us.
$ sudo ./xdp_hw_metadata eth0 ... xsk_ring_cons__peek: 1 0x55fcb80ce7a8: rx_desc[0]->addr=80100 addr=80100 comp_addr=80100 EoP No rx_hash err=-95 HW RX-time: 1677764507059055964 (sec:1677764507.0591) delta to User RX-time sec:0.0002 (237.548 usec) XDP RX-time: 1677764507059280741 (sec:1677764507.0593) delta to User RX-time sec:0.0000 (12.771 usec) 0x55fcb80ce7a8: ping-pong with csum=5619 (want 8626) csum_start=34 csum_offset=6 HW RX-time: 1677764507059055964 (sec:1677764507.0591) delta to HW Txtime sec:1.0000 (1000000.000 usec) 0x55fcb80ce7a8: complete tx idx=0 addr=18 HW Txtime: 1677764508059055964 (sec:1677764508.0591) delta to HW TX-complete-time sec:0.0000 (24.235 usec) HW TX-complete-time: 1677764508059080199 (sec:1677764508.0591) delta to User TX-complete-time sec:0.0054 (5423.263 usec) XDP RX-time: 1677764507059280741 (sec:1677764507.0593) delta to User TX-complete-time sec:1.0052 (1005222.721 usec) HW RX-time: 1677764507059055964 (sec:1677764507.0591) delta to HW TX-complete-time sec:1.0000 (1000024.235 usec) 0x55fcb80ce7a8: complete rx idx=128 addr=80100
$ sudo ./xdp_hw_metadata eth0 -l 10000000 ... xsk_ring_cons__peek: 1 0x5626d54de7a8: rx_desc[0]->addr=80100 addr=80100 comp_addr=80100 EoP No rx_hash err=-95 HW RX-time: 1677764655807717783 (sec:1677764655.8077) delta to User RX-time sec:0.0002 (240.571 usec) XDP RX-time: 1677764655807942983 (sec:1677764655.8079) delta to User RX-time sec:0.0000 (15.371 usec) 0x5626d54de7a8: ping-pong with csum=5619 (want 8626) csum_start=34 csum_offset=6 HW RX-time: 1677764655807717783 (sec:1677764655.8077) delta to HW Txtime sec:0.0100 (10000.000 usec) 0x5626d54de7a8: complete tx idx=0 addr=18 HW Txtime: 1677764655817717783 (sec:1677764655.8177) delta to HW TX-complete-time sec:0.0000 (23.965 usec) HW TX-complete-time: 1677764655817741748 (sec:1677764655.8177) delta to User TX-complete-time sec:0.0003 (291.792 usec) XDP RX-time: 1677764655807942983 (sec:1677764655.8079) delta to User TX-complete-time sec:0.0101 (10090.557 usec) HW RX-time: 1677764655807717783 (sec:1677764655.8077) delta to HW TX-complete-time sec:0.0100 (10023.965 usec) 0x5626d54de7a8: complete rx idx=128 addr=80100
Signed-off-by: Song Yoong Siang yoong.siang.song@intel.com --- tools/testing/selftests/bpf/xdp_hw_metadata.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/xdp_hw_metadata.c b/tools/testing/selftests/bpf/xdp_hw_metadata.c index 3291625ba4fb..e9c3e29dc538 100644 --- a/tools/testing/selftests/bpf/xdp_hw_metadata.c +++ b/tools/testing/selftests/bpf/xdp_hw_metadata.c @@ -13,6 +13,7 @@ * - UDP 9091 packets trigger TX reply * - TX HW timestamp is requested and reported back upon completion * - TX checksum is requested + * - HW txtime is set for transmission */
#include <test_progs.h> @@ -61,6 +62,8 @@ int rxq; bool skip_tx; __u64 last_hw_rx_timestamp; __u64 last_xdp_rx_timestamp; +__u64 last_txtime; +__u64 txtime_delta_to_hw_rx_timestamp = 1000000000; /* 1 second */
void test__fail(void) { /* for network_helpers.c */ }
@@ -274,6 +277,8 @@ static bool complete_tx(struct xsk *xsk, clockid_t clock_id) if (meta->completion.tx_timestamp) { __u64 ref_tstamp = gettime(clock_id);
+ print_tstamp_delta("HW Txtime", "HW TX-complete-time", + last_txtime, meta->completion.tx_timestamp); print_tstamp_delta("HW TX-complete-time", "User TX-complete-time", meta->completion.tx_timestamp, ref_tstamp); print_tstamp_delta("XDP RX-time", "User TX-complete-time", @@ -371,6 +376,13 @@ static void ping_pong(struct xsk *xsk, void *rx_packet, clockid_t clock_id) xsk, ntohs(udph->check), ntohs(want_csum), meta->request.csum_start, meta->request.csum_offset);
+ /* Set txtime for Earliest TxTime First (ETF) */ + meta->flags |= XDP_TXMD_FLAGS_TXTIME; + meta->request.txtime = last_hw_rx_timestamp + txtime_delta_to_hw_rx_timestamp; + last_txtime = meta->request.txtime; + print_tstamp_delta("HW RX-time", "HW Txtime", last_hw_rx_timestamp, + meta->request.txtime); + memcpy(data, rx_packet, len); /* don't share umem chunk for simplicity */ tx_desc->options |= XDP_TX_METADATA; tx_desc->len = len; @@ -595,6 +607,7 @@ static void print_usage(void) " -h Display this help and exit\n\n" " -m Enable multi-buffer XDP for larger MTU\n" " -r Don't generate AF_XDP reply (rx metadata only)\n" + " -l Delta of HW Txtime to HW RX-time in ns (default: 1s)\n" "Generate test packets on the other machine with:\n" " echo -n xdp | nc -u -q1 <dst_ip> 9091\n";
@@ -605,7 +618,7 @@ static void read_args(int argc, char *argv[]) { int opt;
- while ((opt = getopt(argc, argv, "chmr")) != -1) { + while ((opt = getopt(argc, argv, "chmrl:")) != -1) { switch (opt) { case 'c': bind_flags &= ~XDP_USE_NEED_WAKEUP; @@ -621,6 +634,9 @@ static void read_args(int argc, char *argv[]) case 'r': skip_tx = true; break; + case 'l': + txtime_delta_to_hw_rx_timestamp = atoll(optarg); + break; case '?': if (isprint(optopt)) fprintf(stderr, "Unknown option: -%c\n", optopt);
On 12/1/23 07:24, Song Yoong Siang wrote:
This series expands XDP TX metadata framework to include ETF HW offload.
Changes since v1:
- rename Time-Based Scheduling (TBS) to Earliest TxTime First (ETF)
- rename launch-time to txtime
I strongly disagree with this renaming (sorry to disagree with Willem).
The i210 and i225 chips call this LaunchTime in their programmers datasheets, and even in the driver code[1].
Using this "txtime" name in the code is also confusing, because how can people reading the code know the difference between: - tmo_request_timestamp and tmo_request_txtime
[1] https://github.com/xdp-project/xdp-project/blob/master/areas/tsn/code01_foll...
v1: https://patchwork.kernel.org/project/netdevbpf/cover/20231130162028.852006-1...
Song Yoong Siang (3): xsk: add ETF support to XDP Tx metadata net: stmmac: Add txtime support to XDP ZC selftests/bpf: Add txtime to xdp_hw_metadata
Documentation/netlink/specs/netdev.yaml | 4 ++++ Documentation/networking/xsk-tx-metadata.rst | 5 +++++ drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++ .../net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++ include/net/xdp_sock.h | 9 +++++++++ include/net/xdp_sock_drv.h | 1 + include/uapi/linux/if_xdp.h | 9 +++++++++ include/uapi/linux/netdev.h | 3 +++ net/core/netdev-genl.c | 2 ++ net/xdp/xsk.c | 3 +++ tools/include/uapi/linux/if_xdp.h | 9 +++++++++ tools/include/uapi/linux/netdev.h | 3 +++ tools/net/ynl/generated/netdev-user.c | 1 + tools/testing/selftests/bpf/xdp_hw_metadata.c | 18 +++++++++++++++++- 14 files changed, 81 insertions(+), 1 deletion(-)
On Friday, December 1, 2023 6:46 PM, Jesper Dangaard Brouer hawk@kernel.org wrote:
On 12/1/23 07:24, Song Yoong Siang wrote:
This series expands XDP TX metadata framework to include ETF HW offload.
Changes since v1:
- rename Time-Based Scheduling (TBS) to Earliest TxTime First (ETF)
- rename launch-time to txtime
I strongly disagree with this renaming (sorry to disagree with Willem).
The i210 and i225 chips call this LaunchTime in their programmers datasheets, and even in the driver code[1].
Using this "txtime" name in the code is also confusing, because how can people reading the code know the difference between:
- tmo_request_timestamp and tmo_request_txtime
Hi Jesper and Willem,
How about using "launch_time" for the flag/variable and "Earliest TxTime First" for the description/comments?
Thanks & Regards Siang
[1] https://github.com/xdp-project/xdp- project/blob/master/areas/tsn/code01_follow_qdisc_TSN_offload.org
v1:
https://patchwork.kernel.org/project/netdevbpf/cover/20231130162028.852006-1... yoong.siang.song@intel.com/
Song Yoong Siang (3): xsk: add ETF support to XDP Tx metadata net: stmmac: Add txtime support to XDP ZC selftests/bpf: Add txtime to xdp_hw_metadata
Documentation/netlink/specs/netdev.yaml | 4 ++++ Documentation/networking/xsk-tx-metadata.rst | 5 +++++ drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++ .../net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++ include/net/xdp_sock.h | 9 +++++++++ include/net/xdp_sock_drv.h | 1 + include/uapi/linux/if_xdp.h | 9 +++++++++ include/uapi/linux/netdev.h | 3 +++ net/core/netdev-genl.c | 2 ++ net/xdp/xsk.c | 3 +++ tools/include/uapi/linux/if_xdp.h | 9 +++++++++ tools/include/uapi/linux/netdev.h | 3 +++ tools/net/ynl/generated/netdev-user.c | 1 + tools/testing/selftests/bpf/xdp_hw_metadata.c | 18 +++++++++++++++++- 14 files changed, 81 insertions(+), 1 deletion(-)
Song, Yoong Siang wrote:
On Friday, December 1, 2023 6:46 PM, Jesper Dangaard Brouer hawk@kernel.org wrote:
On 12/1/23 07:24, Song Yoong Siang wrote:
This series expands XDP TX metadata framework to include ETF HW offload.
Changes since v1:
- rename Time-Based Scheduling (TBS) to Earliest TxTime First (ETF)
- rename launch-time to txtime
I strongly disagree with this renaming (sorry to disagree with Willem).
The i210 and i225 chips call this LaunchTime in their programmers datasheets, and even in the driver code[1].
Using this "txtime" name in the code is also confusing, because how can people reading the code know the difference between:
- tmo_request_timestamp and tmo_request_txtime
Hi Jesper and Willem,
How about using "launch_time" for the flag/variable and "Earliest TxTime First" for the description/comments?
I don't particularly care which term we use, as long as we're consistent. Especially, don't keep introducing new synonyms.
The fact that one happens to be one vendor's marketing term does not make it preferable, IMHO. On the contrary.
SO_TXTIME is in the ABI, and EDT has been used publicly in kernel patches and conference talks, e.g., Van Jacobson's Netdev 0x12 keynote. Those are vendor agnostic commonly used terms.
But as long as Launch Time is not an Intel only trademark, fine to select that.
On 12/1/23 16:09, Willem de Bruijn wrote:
Song, Yoong Siang wrote:
On Friday, December 1, 2023 6:46 PM, Jesper Dangaard Brouer hawk@kernel.org wrote:
On 12/1/23 07:24, Song Yoong Siang wrote:
This series expands XDP TX metadata framework to include ETF HW offload.
Changes since v1:
- rename Time-Based Scheduling (TBS) to Earliest TxTime First (ETF)
- rename launch-time to txtime
I strongly disagree with this renaming (sorry to disagree with Willem).
The i210 and i225 chips call this LaunchTime in their programmers datasheets, and even in the driver code[1].
Using this "txtime" name in the code is also confusing, because how can people reading the code know the difference between:
- tmo_request_timestamp and tmo_request_txtime
Hi Jesper and Willem,
How about using "launch_time" for the flag/variable and "Earliest TxTime First" for the description/comments?
I don't follow why you are calling the feature: - "Earliest TxTime First" (ETF). - AFAIK this just reference an qdisc name (that most don't know exists)
I don't particularly care which term we use, as long as we're consistent. Especially, don't keep introducing new synonyms.
The fact that one happens to be one vendor's marketing term does not make it preferable, IMHO. On the contrary.
These kind of hardware features are defined as part of Time Sensitive Networking (TSN). I believe these TSN features are defined as part of IEEE 802.1Qbv (2015) and according to Wikipedia[2] incorporated into IEEE 802.1Q.
[2] https://en.wikipedia.org/wiki/Time-Sensitive_Networking
SO_TXTIME is in the ABI, and EDT has been used publicly in kernel patches and conference talks, e.g., Van Jacobson's Netdev 0x12 keynote. Those are vendor agnostic commonly used terms.
I agree that EDT (Earliest Departure Time) have become a thing and term in our community. We could associate this feature with this. I do fear what hardware behavior will be it if I e.g. ask it to send a packet 2 sec in the future on i225 which max support 1 sec. Will hardware send it at 1 sec? Because then I'm violating the *Earliest* Departure Time.
But as long as Launch Time is not an Intel only trademark, fine to select that.
The IEEE 802.1Qbv is sometimes called Time-Aware Shaper (TAS), but I don't like to for us to name this after this. This features is simply taking advantage of exposing one of the hardware building blocks (controlling/setting packet "launch time") that can be used for implementing a TAS.
I like the name "launch time" because it doesn't get easily confused with other timestamps, and intuitively describes packet will be send at a specific time (likely in future).
--Jesper
Jesper Dangaard Brouer wrote:
On 12/1/23 16:09, Willem de Bruijn wrote:
Song, Yoong Siang wrote:
On Friday, December 1, 2023 6:46 PM, Jesper Dangaard Brouer hawk@kernel.org wrote:
On 12/1/23 07:24, Song Yoong Siang wrote:
This series expands XDP TX metadata framework to include ETF HW offload.
Changes since v1:
- rename Time-Based Scheduling (TBS) to Earliest TxTime First (ETF)
- rename launch-time to txtime
I strongly disagree with this renaming (sorry to disagree with Willem).
The i210 and i225 chips call this LaunchTime in their programmers datasheets, and even in the driver code[1].
Using this "txtime" name in the code is also confusing, because how can people reading the code know the difference between:
- tmo_request_timestamp and tmo_request_txtime
Hi Jesper and Willem,
How about using "launch_time" for the flag/variable and "Earliest TxTime First" for the description/comments?
I don't follow why you are calling the feature:
- "Earliest TxTime First" (ETF).
- AFAIK this just reference an qdisc name (that most don't know exists)
I don't particularly care which term we use, as long as we're consistent. Especially, don't keep introducing new synonyms.
The fact that one happens to be one vendor's marketing term does not make it preferable, IMHO. On the contrary.
These kind of hardware features are defined as part of Time Sensitive Networking (TSN). I believe these TSN features are defined as part of IEEE 802.1Qbv (2015) and according to Wikipedia[2] incorporated into IEEE 802.1Q.
[2] https://en.wikipedia.org/wiki/Time-Sensitive_Networking
SO_TXTIME is in the ABI, and EDT has been used publicly in kernel patches and conference talks, e.g., Van Jacobson's Netdev 0x12 keynote. Those are vendor agnostic commonly used terms.
I agree that EDT (Earliest Departure Time) have become a thing and term in our community. We could associate this feature with this. I do fear what hardware behavior will be it if I e.g. ask it to send a packet 2 sec in the future on i225 which max support 1 sec. Will hardware send it at 1 sec? Because then I'm violating the *Earliest* Departure Time.
That should definitely not happen. At least not on a device that implements EDT semantics.
This relates to Jakub's question in the previous thread on whether this mechanism allows out-of-order transmission or maintains FIFO behavior. That really is device specific.
Older devices only support this for low rate (PTP) and with a small fixed number of outstanding requests. For pacing offload, devices need to support up to linerate and out-of-order.
I don't think we want to enforce either in software, as the hardware is already out there. But it would be good if drivers can somehow label these capabilities. Including programmable horizon.
It is up to the qdisc to ensure that it does not pass packets to the device beyond its horizon.
ETF and FQ already have a concept of horizon. And a way to queue errors for packets out of bound (SO_EE_CODE_TXTIME_..).
But as long as Launch Time is not an Intel only trademark, fine to select that.
The IEEE 802.1Qbv is sometimes called Time-Aware Shaper (TAS), but I don't like to for us to name this after this. This features is simply taking advantage of exposing one of the hardware building blocks (controlling/setting packet "launch time") that can be used for implementing a TAS.
I like the name "launch time" because it doesn't get easily confused with other timestamps, and intuitively describes packet will be send at a specific time (likely in future).
--Jesper
Understood on your point that txtime and tx_timestamp are too similar. As said, I don't care strongly. Launch time sounds fine to me. Others can speak up if they disagree.
I take launch time as a less strict than EDT: it is a request to send at a certain time, with no strict definition on uncertainty. While EDT more strictly ensures that a packet is not sent before the timestamp.
On Saturday, December 2, 2023 10:16 PM, Willem de Bruijn wrote:
Jesper Dangaard Brouer wrote:
On 12/1/23 16:09, Willem de Bruijn wrote:
Song, Yoong Siang wrote:
On Friday, December 1, 2023 6:46 PM, Jesper Dangaard Brouer
hawk@kernel.org wrote:
On 12/1/23 07:24, Song Yoong Siang wrote:
This series expands XDP TX metadata framework to include ETF HW offload.
Changes since v1:
- rename Time-Based Scheduling (TBS) to Earliest TxTime First (ETF)
- rename launch-time to txtime
I strongly disagree with this renaming (sorry to disagree with Willem).
The i210 and i225 chips call this LaunchTime in their programmers datasheets, and even in the driver code[1].
Using this "txtime" name in the code is also confusing, because how can people reading the code know the difference between:
- tmo_request_timestamp and tmo_request_txtime
Hi Jesper and Willem,
How about using "launch_time" for the flag/variable and "Earliest TxTime First" for the description/comments?
I don't follow why you are calling the feature:
- "Earliest TxTime First" (ETF).
- AFAIK this just reference an qdisc name (that most don't know exists)
I don't particularly care which term we use, as long as we're consistent. Especially, don't keep introducing new synonyms.
The fact that one happens to be one vendor's marketing term does not make it preferable, IMHO. On the contrary.
These kind of hardware features are defined as part of Time Sensitive Networking (TSN). I believe these TSN features are defined as part of IEEE 802.1Qbv (2015) and according to Wikipedia[2] incorporated into IEEE 802.1Q.
[2] https://en.wikipedia.org/wiki/Time-Sensitive_Networking
SO_TXTIME is in the ABI, and EDT has been used publicly in kernel patches and conference talks, e.g., Van Jacobson's Netdev 0x12 keynote. Those are vendor agnostic commonly used terms.
I agree that EDT (Earliest Departure Time) have become a thing and term in our community. We could associate this feature with this. I do fear what hardware behavior will be it if I e.g. ask it to send a packet 2 sec in the future on i225 which max support 1 sec. Will hardware send it at 1 sec? Because then I'm violating the *Earliest* Departure Time.
That should definitely not happen. At least not on a device that implements EDT semantics.
This relates to Jakub's question in the previous thread on whether this mechanism allows out-of-order transmission or maintains FIFO behavior. That really is device specific.
Older devices only support this for low rate (PTP) and with a small fixed number of outstanding requests. For pacing offload, devices need to support up to linerate and out-of-order.
I don't think we want to enforce either in software, as the hardware is already out there. But it would be good if drivers can somehow label these capabilities. Including programmable horizon.
It is up to the qdisc to ensure that it does not pass packets to the device beyond its horizon.
ETF and FQ already have a concept of horizon. And a way to queue errors for packets out of bound (SO_EE_CODE_TXTIME_..).
But as long as Launch Time is not an Intel only trademark, fine to select that.
The IEEE 802.1Qbv is sometimes called Time-Aware Shaper (TAS), but I don't like to for us to name this after this. This features is simply taking advantage of exposing one of the hardware building blocks (controlling/setting packet "launch time") that can be used for implementing a TAS.
I like the name "launch time" because it doesn't get easily confused with other timestamps, and intuitively describes packet will be send at a specific time (likely in future).
--Jesper
Understood on your point that txtime and tx_timestamp are too similar. As said, I don't care strongly. Launch time sounds fine to me. Others can speak up if they disagree.
I take launch time as a less strict than EDT: it is a request to send at a certain time, with no strict definition on uncertainty. While EDT more strictly ensures that a packet is not sent before the timestamp.
Thanks for the deep discussion and information. I agree with launch time too. I will submit v3 with launch time so that others can review on the new naming and provide their feedback.
linux-kselftest-mirror@lists.linaro.org