We found some easily reproducible bugs when testing the XDP function of the enetc driver. They not only cause XDP to not work, but also prevent the network from being restored after the XDP program exits. This patch set fixes these bugs; for details, please see the commit message of each patch.
Wei Fang (3): net: enetc: remove xdp_drops statistic from enetc_xdp_drop() net: enetc: fix the issues of XDP_REDIRECT feature net: enetc: reset xdp_tx_in_flight when updating bpf program
drivers/net/ethernet/freescale/enetc/enetc.c | 46 +++++++++++++++----- drivers/net/ethernet/freescale/enetc/enetc.h | 1 + 2 files changed, 37 insertions(+), 10 deletions(-)
The xdp_drops statistic indicates the number of XDP frames dropped in the Rx direction. However, enetc_xdp_drop() is also used in the XDP_TX and XDP_REDIRECT actions. If frame loss occurs in these two actions, the loss should not be counted in xdp_drops, because xdp_tx_drops and xdp_redirect_failures already count the frame loss for those actions. So it's better to remove the xdp_drops statistic from enetc_xdp_drop() and increase xdp_drops in the XDP_DROP action instead.
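The accounting rule above can be sketched as a small userspace model (illustrative names and structure only, not the actual enetc code): each XDP verdict bumps exactly one counter, and the shared drop helper no longer touches xdp_drops itself.

```c
#include <assert.h>

/* Userspace sketch of the per-action counters discussed in the patch.
 * The names mirror the driver's stats, but this is a hypothetical model,
 * not enetc code. */
enum xdp_action { XDP_DROP, XDP_TX, XDP_REDIRECT };

struct ring_stats {
	unsigned long xdp_drops;             /* XDP_DROP only, after the fix */
	unsigned long xdp_tx_drops;          /* losses in XDP_TX */
	unsigned long xdp_redirect_failures; /* losses in XDP_REDIRECT */
};

/* Shared drop helper: after the fix it only recycles buffers (elided
 * here) and does NOT bump xdp_drops, so each caller accounts the loss
 * to the right counter. */
static void xdp_drop(struct ring_stats *s) { (void)s; }

static void handle_verdict(struct ring_stats *s, enum xdp_action act, int failed)
{
	switch (act) {
	case XDP_DROP:
		xdp_drop(s);
		s->xdp_drops++;       /* counted exactly once, here */
		break;
	case XDP_TX:
		if (failed) { xdp_drop(s); s->xdp_tx_drops++; }
		break;
	case XDP_REDIRECT:
		if (failed) { xdp_drop(s); s->xdp_redirect_failures++; }
		break;
	}
}
```

With the old behavior (xdp_drops bumped inside xdp_drop()), a failed XDP_TX frame would have inflated both xdp_tx_drops and xdp_drops; in this model each loss lands in a single counter.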
Fixes: 7ed2bc80074e ("net: enetc: add support for XDP_TX") Signed-off-by: Wei Fang wei.fang@nxp.com --- drivers/net/ethernet/freescale/enetc/enetc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c index 032d8eadd003..56e59721ec7d 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -1521,7 +1521,6 @@ static void enetc_xdp_drop(struct enetc_bdr *rx_ring, int rx_ring_first, &rx_ring->rx_swbd[rx_ring_first]); enetc_bdr_idx_inc(rx_ring, &rx_ring_first); } - rx_ring->stats.xdp_drops++; }
static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, @@ -1586,6 +1585,7 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, fallthrough; case XDP_DROP: enetc_xdp_drop(rx_ring, orig_i, i); + rx_ring->stats.xdp_drops++; break; case XDP_PASS: rxbd = orig_rxbd;
Hi,
Thanks for your patch.
FYI: kernel test robot notices the stable kernel rule is not satisfied.
The check is based on https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html#opti...
Rule: add the tag "Cc: stable@vger.kernel.org" in the sign-off area to have the patch automatically included in the stable tree. Subject: [PATCH net 1/3] net: enetc: remove xdp_drops statistic from enetc_xdp_drop() Link: https://lore.kernel.org/stable/20240919084104.661180-2-wei.fang%40nxp.com
On Thu, Sep 19, 2024 at 04:41:02PM +0800, Wei Fang wrote:
The xdp_drops statistic indicates the number of XDP frames dropped in the Rx direction. However, enetc_xdp_drop() is also used in XDP_TX and XDP_REDIRECT actions. If frame loss occurs in these two actions, the frames loss count should not be included in xdp_drops, because there are already xdp_tx_drops and xdp_redirect_failures to count the frame loss of these two actions, so it's better to remove xdp_drops statistic from enetc_xdp_drop() and increase xdp_drops in XDP_DROP action.
Fixes: 7ed2bc80074e ("net: enetc: add support for XDP_TX") Signed-off-by: Wei Fang wei.fang@nxp.com
Reviewed-by: Maciej Fijalkowski maciej.fijalkowski@intel.com
drivers/net/ethernet/freescale/enetc/enetc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c index 032d8eadd003..56e59721ec7d 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -1521,7 +1521,6 @@ static void enetc_xdp_drop(struct enetc_bdr *rx_ring, int rx_ring_first, &rx_ring->rx_swbd[rx_ring_first]); enetc_bdr_idx_inc(rx_ring, &rx_ring_first); }
- rx_ring->stats.xdp_drops++;
} static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, @@ -1586,6 +1585,7 @@ static int enetc_clean_rx_ring_xdp(struct enetc_bdr *rx_ring, fallthrough; case XDP_DROP: enetc_xdp_drop(rx_ring, orig_i, i);
+ rx_ring->stats.xdp_drops++; break; case XDP_PASS: rxbd = orig_rxbd;
-- 2.34.1
On 2024-09-19 at 14:11:02, Wei Fang (wei.fang@nxp.com) wrote:
The xdp_drops statistic indicates the number of XDP frames dropped in the Rx direction. However, enetc_xdp_drop() is also used in XDP_TX and XDP_REDIRECT actions. If frame loss occurs in these two actions, the frames loss count should not be included in xdp_drops, because there are already xdp_tx_drops and xdp_redirect_failures to count the frame loss of these two actions, so it's better to remove xdp_drops statistic from enetc_xdp_drop() and increase xdp_drops in XDP_DROP action.
nit: s/xdp_drops/xdp_rx_drops would be appropriate as you have xdp_tx_drops and xdp_redirect_failures.
-----Original Message----- From: Ratheesh Kannoth rkannoth@marvell.com Sent: 2024年9月23日 13:03 To: Wei Fang wei.fang@nxp.com Cc: davem@davemloft.net; edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; Claudiu Manoil claudiu.manoil@nxp.com; Vladimir Oltean vladimir.oltean@nxp.com; ast@kernel.org; daniel@iogearbox.net; hawk@kernel.org; john.fastabend@gmail.com; linux-kernel@vger.kernel.org; netdev@vger.kernel.org; bpf@vger.kernel.org; stable@vger.kernel.org; imx@lists.linux.dev Subject: Re: [PATCH net 1/3] net: enetc: remove xdp_drops statistic from enetc_xdp_drop()
On 2024-09-19 at 14:11:02, Wei Fang (wei.fang@nxp.com) wrote:
The xdp_drops statistic indicates the number of XDP frames dropped in the Rx direction. However, enetc_xdp_drop() is also used in XDP_TX and XDP_REDIRECT actions. If frame loss occurs in these two actions, the frames loss count should not be included in xdp_drops, because there are already xdp_tx_drops and xdp_redirect_failures to count the frame loss of these two actions, so it's better to remove xdp_drops statistic from enetc_xdp_drop() and increase xdp_drops in XDP_DROP action.
nit: s/xdp_drops/xdp_rx_drops would be appropriate as you have xdp_tx_drops and xdp_redirect_failures.
Sorry, I don't quite understand what you mean.
On Mon, Sep 23, 2024 at 08:53:07AM +0300, Wei Fang wrote:
-----Original Message----- From: Ratheesh Kannoth rkannoth@marvell.com Sent: 2024年9月23日 13:03 To: Wei Fang wei.fang@nxp.com Cc: davem@davemloft.net; edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; Claudiu Manoil claudiu.manoil@nxp.com; Vladimir Oltean vladimir.oltean@nxp.com; ast@kernel.org; daniel@iogearbox.net; hawk@kernel.org; john.fastabend@gmail.com; linux-kernel@vger.kernel.org; netdev@vger.kernel.org; bpf@vger.kernel.org; stable@vger.kernel.org; imx@lists.linux.dev Subject: Re: [PATCH net 1/3] net: enetc: remove xdp_drops statistic from enetc_xdp_drop()
On 2024-09-19 at 14:11:02, Wei Fang (wei.fang@nxp.com) wrote:
The xdp_drops statistic indicates the number of XDP frames dropped in the Rx direction. However, enetc_xdp_drop() is also used in XDP_TX and XDP_REDIRECT actions. If frame loss occurs in these two actions, the frames loss count should not be included in xdp_drops, because there are already xdp_tx_drops and xdp_redirect_failures to count the frame loss of these two actions, so it's better to remove xdp_drops statistic from enetc_xdp_drop() and increase xdp_drops in XDP_DROP action.
nit: s/xdp_drops/xdp_rx_drops would be appropriate as you have xdp_tx_drops and xdp_redirect_failures.
Sorry, I don't quite understand what you mean.
I don't understand what he means either. I guess he didn't realize you aren't proposing any new name, just working with existing concepts in the driver. Anyway, an ack about this from Ratheesh would be great, to not leave us hanging.
When testing the XDP_REDIRECT function on the LS1028A platform, we found a highly reproducible issue: Tx frames can no longer be sent out, even after XDP_REDIRECT is turned off. Specifically, if there is a lot of traffic in the Rx direction when XDP_REDIRECT is turned on, the console may display warnings like "timeout for tx ring #6 clear", and all redirected frames are dropped. The detailed log is as follows.
root@ls1028ardb:~# ./xdp-bench redirect eno0 eno2 Redirecting from eno0 (ifindex 3; driver fsl_enetc) to eno2 (ifindex 4; driver fsl_enetc) [203.849809] fsl_enetc 0000:00:00.2 eno2: timeout for tx ring #5 clear [204.006051] fsl_enetc 0000:00:00.2 eno2: timeout for tx ring #6 clear [204.161944] fsl_enetc 0000:00:00.2 eno2: timeout for tx ring #7 clear eno0->eno2 1420505 rx/s 1420590 err,drop/s 0 xmit/s xmit eno0->eno2 0 xmit/s 1420590 drop/s 0 drv_err/s 15.71 bulk-avg eno0->eno2 1420484 rx/s 1420485 err,drop/s 0 xmit/s xmit eno0->eno2 0 xmit/s 1420485 drop/s 0 drv_err/s 15.71 bulk-avg
By analyzing the XDP_REDIRECT implementation of the enetc driver, we found two problems. First, the enetc driver reconfigures the Tx and Rx BD rings when a bpf program is installed or uninstalled, but there is no mechanism to block redirected frames while the BD rings are being reconfigured. So introduce the ENETC_TX_DOWN flag to prevent redirected frames from being attached to the Tx BD rings.
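The ENETC_TX_DOWN gate described above can be sketched in userspace C. This is a hypothetical model (the bit helpers and xdp_xmit below are stand-ins, not the driver's code): once the bit is set, the xmit path bails out with -ENETDOWN instead of touching a ring under reconfiguration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of the ENETC_TX_DOWN flag gating ndo_xdp_xmit. */
enum { ENETC_TX_DOWN = 0 };

static unsigned long flags;

/* Simplified, non-atomic stand-ins for the kernel bitops. */
static int test_bit(int nr, const unsigned long *addr) { return (*addr >> nr) & 1; }
static void set_bit(int nr, unsigned long *addr) { *addr |= 1UL << nr; }
static void clear_bit(int nr, unsigned long *addr) { *addr &= ~(1UL << nr); }

static int xdp_xmit(int num_frames)
{
	/* Reject redirected frames while the Tx path is being torn down. */
	if (test_bit(ENETC_TX_DOWN, &flags))
		return -ENETDOWN;
	return num_frames; /* pretend all frames were queued */
}
```

enetc_stop() would set the bit before stopping the queues, and enetc_start() would clear it only after the rings are back up, so no redirected frame can slip in between.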
Second, the Tx BD rings are disabled first in enetc_stop(), which then waits for them to become empty. This is not safe while a Tx BD ring is actively transmitting frames: the ring may never become empty, and a hardware exception may occur. As described in the block guide of the LS1028A NETC, software should only disable an active ring after all pending ring entries have been consumed (i.e. when PI = CI). Disabling a transmit ring that is actively processing BDs risks a HW-SW race hazard, whereby a hardware resource becomes assigned to work on one or more ring entries, only to have those entries removed because the ring is disabled. So the correct behavior is for software to stop putting frames on the Tx BD rings (this is what ENETC_TX_DOWN does), then wait for the Tx BD rings to become empty, and finally disable them.
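The safe shutdown order above (stop producers, drain until PI = CI, then disable) can be modeled in a few lines of userspace C. The struct and helpers are illustrative assumptions, not driver code.

```c
#include <assert.h>

/* Toy model of a Tx BD ring: pi is the software producer index, ci the
 * hardware consumer index. Names are illustrative only. */
struct tx_ring { unsigned int pi, ci; int enabled; };

/* Stand-in for the hardware consuming one pending BD per call. */
static void hw_consume(struct tx_ring *r) { if (r->ci != r->pi) r->ci++; }

static void safe_disable(struct tx_ring *r)
{
	/* Step 1 happens in the caller: producers are already stopped
	 * (the ENETC_TX_DOWN flag), so pi can no longer advance. */
	while (r->ci != r->pi)  /* step 2: wait until PI == CI */
		hw_consume(r);
	r->enabled = 0;         /* step 3: ring is idle, safe to disable */
}
```

Disabling before the drain (the old enetc_stop() order) would correspond to setting enabled = 0 while ci != pi, which is exactly the HW-SW race the block guide warns about.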
Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()") Cc: stable@vger.kernel.org Signed-off-by: Wei Fang wei.fang@nxp.com --- drivers/net/ethernet/freescale/enetc/enetc.c | 43 ++++++++++++++++---- drivers/net/ethernet/freescale/enetc/enetc.h | 1 + 2 files changed, 35 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c index 56e59721ec7d..5830c046cb7d 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -902,6 +902,7 @@ static bool enetc_clean_tx_ring(struct enetc_bdr *tx_ring, int napi_budget)
if (unlikely(tx_frm_cnt && netif_carrier_ok(ndev) && __netif_subqueue_stopped(ndev, tx_ring->index) && + !test_bit(ENETC_TX_DOWN, &priv->flags) && (enetc_bd_unused(tx_ring) >= ENETC_TXBDS_MAX_NEEDED))) { netif_wake_subqueue(ndev, tx_ring->index); } @@ -1377,6 +1378,9 @@ int enetc_xdp_xmit(struct net_device *ndev, int num_frames, int xdp_tx_bd_cnt, i, k; int xdp_tx_frm_cnt = 0;
+ if (unlikely(test_bit(ENETC_TX_DOWN, &priv->flags))) + return -ENETDOWN; + enetc_lock_mdio();
tx_ring = priv->xdp_tx_ring[smp_processor_id()]; @@ -2223,18 +2227,24 @@ static void enetc_enable_rxbdr(struct enetc_hw *hw, struct enetc_bdr *rx_ring) enetc_rxbdr_wr(hw, idx, ENETC_RBMR, rbmr); }
-static void enetc_enable_bdrs(struct enetc_ndev_priv *priv) +static void enetc_enable_rx_bdrs(struct enetc_ndev_priv *priv) { struct enetc_hw *hw = &priv->si->hw; int i;
- for (i = 0; i < priv->num_tx_rings; i++) - enetc_enable_txbdr(hw, priv->tx_ring[i]); - for (i = 0; i < priv->num_rx_rings; i++) enetc_enable_rxbdr(hw, priv->rx_ring[i]); }
+static void enetc_enable_tx_bdrs(struct enetc_ndev_priv *priv) +{ + struct enetc_hw *hw = &priv->si->hw; + int i; + + for (i = 0; i < priv->num_tx_rings; i++) + enetc_enable_txbdr(hw, priv->tx_ring[i]); +} + static void enetc_disable_rxbdr(struct enetc_hw *hw, struct enetc_bdr *rx_ring) { int idx = rx_ring->index; @@ -2251,7 +2261,16 @@ static void enetc_disable_txbdr(struct enetc_hw *hw, struct enetc_bdr *rx_ring) enetc_txbdr_wr(hw, idx, ENETC_TBMR, 0); }
-static void enetc_disable_bdrs(struct enetc_ndev_priv *priv) +static void enetc_disable_rx_bdrs(struct enetc_ndev_priv *priv) +{ + struct enetc_hw *hw = &priv->si->hw; + int i; + + for (i = 0; i < priv->num_rx_rings; i++) + enetc_disable_rxbdr(hw, priv->rx_ring[i]); +} + +static void enetc_disable_tx_bdrs(struct enetc_ndev_priv *priv) { struct enetc_hw *hw = &priv->si->hw; int i; @@ -2259,8 +2278,6 @@ static void enetc_disable_bdrs(struct enetc_ndev_priv *priv) for (i = 0; i < priv->num_tx_rings; i++) enetc_disable_txbdr(hw, priv->tx_ring[i]);
- for (i = 0; i < priv->num_rx_rings; i++) - enetc_disable_rxbdr(hw, priv->rx_ring[i]); }
static void enetc_wait_txbdr(struct enetc_hw *hw, struct enetc_bdr *tx_ring) @@ -2452,6 +2469,8 @@ void enetc_start(struct net_device *ndev)
enetc_setup_interrupts(priv);
+ enetc_enable_tx_bdrs(priv); + for (i = 0; i < priv->bdr_int_num; i++) { int irq = pci_irq_vector(priv->si->pdev, ENETC_BDR_INT_BASE_IDX + i); @@ -2460,9 +2479,11 @@ void enetc_start(struct net_device *ndev) enable_irq(irq); }
- enetc_enable_bdrs(priv); + enetc_enable_rx_bdrs(priv);
netif_tx_start_all_queues(ndev); + + clear_bit(ENETC_TX_DOWN, &priv->flags); } EXPORT_SYMBOL_GPL(enetc_start);
@@ -2520,9 +2541,11 @@ void enetc_stop(struct net_device *ndev) struct enetc_ndev_priv *priv = netdev_priv(ndev); int i;
+ set_bit(ENETC_TX_DOWN, &priv->flags); + netif_tx_stop_all_queues(ndev);
- enetc_disable_bdrs(priv); + enetc_disable_rx_bdrs(priv);
for (i = 0; i < priv->bdr_int_num; i++) { int irq = pci_irq_vector(priv->si->pdev, @@ -2535,6 +2558,8 @@ void enetc_stop(struct net_device *ndev)
enetc_wait_bdrs(priv);
+ enetc_disable_tx_bdrs(priv); + enetc_clear_interrupts(priv); } EXPORT_SYMBOL_GPL(enetc_stop); diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h index 97524dfa234c..fb7d98d57783 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.h +++ b/drivers/net/ethernet/freescale/enetc/enetc.h @@ -325,6 +325,7 @@ enum enetc_active_offloads {
enum enetc_flags_bit { ENETC_TX_ONESTEP_TSTAMP_IN_PROGRESS = 0, + ENETC_TX_DOWN, };
/* interrupt coalescing modes */
On Thu, Sep 19, 2024 at 04:41:03PM +0800, Wei Fang wrote:
When testing the XDP_REDIRECT function on the LS1028A platform, we found a highly reproducible issue: Tx frames can no longer be sent out, even after XDP_REDIRECT is turned off. Specifically, if there is a lot of traffic in the Rx direction when XDP_REDIRECT is turned on, the console may display warnings like "timeout for tx ring #6 clear", and all redirected frames are dropped. The detailed log is as follows.
root@ls1028ardb:~# ./xdp-bench redirect eno0 eno2 Redirecting from eno0 (ifindex 3; driver fsl_enetc) to eno2 (ifindex 4; driver fsl_enetc) [203.849809] fsl_enetc 0000:00:00.2 eno2: timeout for tx ring #5 clear [204.006051] fsl_enetc 0000:00:00.2 eno2: timeout for tx ring #6 clear [204.161944] fsl_enetc 0000:00:00.2 eno2: timeout for tx ring #7 clear eno0->eno2 1420505 rx/s 1420590 err,drop/s 0 xmit/s xmit eno0->eno2 0 xmit/s 1420590 drop/s 0 drv_err/s 15.71 bulk-avg eno0->eno2 1420484 rx/s 1420485 err,drop/s 0 xmit/s xmit eno0->eno2 0 xmit/s 1420485 drop/s 0 drv_err/s 15.71 bulk-avg
By analyzing the XDP_REDIRECT implementation of enetc driver, we found two problems. First, enetc driver will reconfigure Tx and Rx BD rings when a bpf program is installed or uninstalled, but there is no mechanisms to block the redirected frames when enetc driver reconfigures BD rings. So introduce ENETC_TX_DOWN flag to prevent the redirected frames to be attached to Tx BD rings.
Second, Tx BD rings are disabled first in enetc_stop() and then wait for empty. This operation is not safe while the Tx BD ring is actively transmitting frames, and will cause the ring to not be empty and hardware exception. As described in the block guide of LS1028A NETC, software should only disable an active ring after all pending ring entries have been consumed (i.e. when PI = CI). Disabling a transmit ring that is actively processing BDs risks a HW-SW race hazard whereby a hardware resource becomes assigned to work on one or more ring entries only to have those entries be removed due to the ring becoming disabled. So the correct behavior is that the software stops putting frames on the Tx BD rings (this is what ENETC_TX_DOWN does), then waits for the Tx BD rings to be empty, and finally disables the Tx BD rings.
Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()") Cc: stable@vger.kernel.org Signed-off-by: Wei Fang wei.fang@nxp.com
Reviewed-by: Maciej Fijalkowski maciej.fijalkowski@intel.com
drivers/net/ethernet/freescale/enetc/enetc.c | 43 ++++++++++++++++---- drivers/net/ethernet/freescale/enetc/enetc.h | 1 + 2 files changed, 35 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c index 56e59721ec7d..5830c046cb7d 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -902,6 +902,7 @@ static bool enetc_clean_tx_ring(struct enetc_bdr *tx_ring, int napi_budget) if (unlikely(tx_frm_cnt && netif_carrier_ok(ndev) && __netif_subqueue_stopped(ndev, tx_ring->index) &&
+ !test_bit(ENETC_TX_DOWN, &priv->flags) && (enetc_bd_unused(tx_ring) >= ENETC_TXBDS_MAX_NEEDED))) { netif_wake_subqueue(ndev, tx_ring->index); }
@@ -1377,6 +1378,9 @@ int enetc_xdp_xmit(struct net_device *ndev, int num_frames, int xdp_tx_bd_cnt, i, k; int xdp_tx_frm_cnt = 0;
+ if (unlikely(test_bit(ENETC_TX_DOWN, &priv->flags))) + return -ENETDOWN; + enetc_lock_mdio();
tx_ring = priv->xdp_tx_ring[smp_processor_id()]; @@ -2223,18 +2227,24 @@ static void enetc_enable_rxbdr(struct enetc_hw *hw, struct enetc_bdr *rx_ring) enetc_rxbdr_wr(hw, idx, ENETC_RBMR, rbmr); } -static void enetc_enable_bdrs(struct enetc_ndev_priv *priv) +static void enetc_enable_rx_bdrs(struct enetc_ndev_priv *priv) { struct enetc_hw *hw = &priv->si->hw; int i;
- for (i = 0; i < priv->num_tx_rings; i++) - enetc_enable_txbdr(hw, priv->tx_ring[i]); - for (i = 0; i < priv->num_rx_rings; i++) enetc_enable_rxbdr(hw, priv->rx_ring[i]);
} +static void enetc_enable_tx_bdrs(struct enetc_ndev_priv *priv) +{
+ struct enetc_hw *hw = &priv->si->hw; + int i; + + for (i = 0; i < priv->num_tx_rings; i++) + enetc_enable_txbdr(hw, priv->tx_ring[i]); +}
static void enetc_disable_rxbdr(struct enetc_hw *hw, struct enetc_bdr *rx_ring) { int idx = rx_ring->index; @@ -2251,7 +2261,16 @@ static void enetc_disable_txbdr(struct enetc_hw *hw, struct enetc_bdr *rx_ring) enetc_txbdr_wr(hw, idx, ENETC_TBMR, 0); } -static void enetc_disable_bdrs(struct enetc_ndev_priv *priv) +static void enetc_disable_rx_bdrs(struct enetc_ndev_priv *priv) +{
+ struct enetc_hw *hw = &priv->si->hw; + int i; + + for (i = 0; i < priv->num_rx_rings; i++) + enetc_disable_rxbdr(hw, priv->rx_ring[i]); +}
+static void enetc_disable_tx_bdrs(struct enetc_ndev_priv *priv) { struct enetc_hw *hw = &priv->si->hw; int i; @@ -2259,8 +2278,6 @@ static void enetc_disable_bdrs(struct enetc_ndev_priv *priv) for (i = 0; i < priv->num_tx_rings; i++) enetc_disable_txbdr(hw, priv->tx_ring[i]);
- for (i = 0; i < priv->num_rx_rings; i++) - enetc_disable_rxbdr(hw, priv->rx_ring[i]);
} static void enetc_wait_txbdr(struct enetc_hw *hw, struct enetc_bdr *tx_ring) @@ -2452,6 +2469,8 @@ void enetc_start(struct net_device *ndev) enetc_setup_interrupts(priv);
+ enetc_enable_tx_bdrs(priv); + for (i = 0; i < priv->bdr_int_num; i++) { int irq = pci_irq_vector(priv->si->pdev, ENETC_BDR_INT_BASE_IDX + i);
@@ -2460,9 +2479,11 @@ void enetc_start(struct net_device *ndev) enable_irq(irq); }
- enetc_enable_bdrs(priv);
+ enetc_enable_rx_bdrs(priv);
netif_tx_start_all_queues(ndev);
+ + clear_bit(ENETC_TX_DOWN, &priv->flags);
} EXPORT_SYMBOL_GPL(enetc_start); @@ -2520,9 +2541,11 @@ void enetc_stop(struct net_device *ndev) struct enetc_ndev_priv *priv = netdev_priv(ndev); int i;
+ set_bit(ENETC_TX_DOWN, &priv->flags); + netif_tx_stop_all_queues(ndev);
- enetc_disable_bdrs(priv);
+ enetc_disable_rx_bdrs(priv);
for (i = 0; i < priv->bdr_int_num; i++) { int irq = pci_irq_vector(priv->si->pdev, @@ -2535,6 +2558,8 @@ void enetc_stop(struct net_device *ndev) enetc_wait_bdrs(priv);
+ enetc_disable_tx_bdrs(priv); + enetc_clear_interrupts(priv);
} EXPORT_SYMBOL_GPL(enetc_stop); diff --git a/drivers/net/ethernet/freescale/enetc/enetc.h b/drivers/net/ethernet/freescale/enetc/enetc.h index 97524dfa234c..fb7d98d57783 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.h +++ b/drivers/net/ethernet/freescale/enetc/enetc.h @@ -325,6 +325,7 @@ enum enetc_active_offloads { enum enetc_flags_bit { ENETC_TX_ONESTEP_TSTAMP_IN_PROGRESS = 0,
+ ENETC_TX_DOWN,
}; /* interrupt coalescing modes */ -- 2.34.1
On Thu, Sep 19, 2024 at 04:41:03PM +0800, Wei Fang wrote:
@@ -2251,7 +2261,16 @@ static void enetc_disable_txbdr(struct enetc_hw *hw, struct enetc_bdr *rx_ring) enetc_txbdr_wr(hw, idx, ENETC_TBMR, 0); } -static void enetc_disable_bdrs(struct enetc_ndev_priv *priv) +static void enetc_disable_rx_bdrs(struct enetc_ndev_priv *priv) +{
+ struct enetc_hw *hw = &priv->si->hw; + int i; + + for (i = 0; i < priv->num_rx_rings; i++) + enetc_disable_rxbdr(hw, priv->rx_ring[i]); +}
+static void enetc_disable_tx_bdrs(struct enetc_ndev_priv *priv) { struct enetc_hw *hw = &priv->si->hw; int i; @@ -2259,8 +2278,6 @@ static void enetc_disable_bdrs(struct enetc_ndev_priv *priv) for (i = 0; i < priv->num_tx_rings; i++) enetc_disable_txbdr(hw, priv->tx_ring[i]);
Please do not leave a blank line here. In the git tree after applying this patch, that blank line appears at the end of enetc_disable_tx_bdrs().
- for (i = 0; i < priv->num_rx_rings; i++) - enetc_disable_rxbdr(hw, priv->rx_ring[i]);
}
-----Original Message----- From: Vladimir Oltean vladimir.oltean@nxp.com Sent: 2024年9月27日 22:42 To: Wei Fang wei.fang@nxp.com Cc: davem@davemloft.net; edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; Claudiu Manoil claudiu.manoil@nxp.com; ast@kernel.org; daniel@iogearbox.net; hawk@kernel.org; john.fastabend@gmail.com; linux-kernel@vger.kernel.org; netdev@vger.kernel.org; bpf@vger.kernel.org; stable@vger.kernel.org; imx@lists.linux.dev Subject: Re: [PATCH net 2/3] net: enetc: fix the issues of XDP_REDIRECT feature
On Thu, Sep 19, 2024 at 04:41:03PM +0800, Wei Fang wrote:
@@ -2251,7 +2261,16 @@ static void enetc_disable_txbdr(struct enetc_hw
*hw, struct enetc_bdr *rx_ring)
enetc_txbdr_wr(hw, idx, ENETC_TBMR, 0); }
-static void enetc_disable_bdrs(struct enetc_ndev_priv *priv) +static void enetc_disable_rx_bdrs(struct enetc_ndev_priv *priv) +{
+ struct enetc_hw *hw = &priv->si->hw; + int i; + + for (i = 0; i < priv->num_rx_rings; i++) + enetc_disable_rxbdr(hw, priv->rx_ring[i]); +}
+static void enetc_disable_tx_bdrs(struct enetc_ndev_priv *priv) { struct enetc_hw *hw = &priv->si->hw; int i; @@ -2259,8 +2278,6 @@ static void enetc_disable_bdrs(struct
enetc_ndev_priv *priv)
for (i = 0; i < priv->num_tx_rings; i++) enetc_disable_txbdr(hw, priv->tx_ring[i]);
Please do not leave a blank line here. In the git tree after applying this patch, that blank line appears at the end of enetc_disable_tx_bdrs().
Thanks for the reminder. It's weird that checkpatch.pl did not raise a warning here.
When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC on LS1028A, it was found that if the command was re-run multiple times, Rx could no longer receive frames, and the output of xdp-bench showed that the rx rate was 0.
root@ls1028ardb:~# ./xdp-bench tx eno0 Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc) Summary 2046 rx/s 0 err,drop/s Summary 0 rx/s 0 err,drop/s Summary 0 rx/s 0 err,drop/s Summary 0 rx/s 0 err,drop/s
By observing the Rx PIR and CIR registers, we found that CIR is always equal to 0x7FF and PIR is always 0x7FE, which means that the Rx ring is full and can no longer accommodate other Rx frames. Therefore, it is obvious that the RX BD ring has not been cleaned up.
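The "ring full" reading of those register values can be checked with a little arithmetic. The helper below is a hypothetical stand-in for the driver's free-slot calculation (the real function and its exact convention are not shown here): with a 2048-entry ring and one slot kept unused, PIR one behind CIR means zero free slots.

```c
#include <assert.h>

/* Assumed ring size from the commit message: 2048 entries (power of two). */
#define RING_SIZE 2048

/* Illustrative free-slot count for a ring where one slot stays unused,
 * so "full" is pir == cir - 1 (mod RING_SIZE). Not the driver's code. */
static unsigned int bd_unused(unsigned int pir, unsigned int cir)
{
	return (cir - pir - 1) & (RING_SIZE - 1);
}
```

Plugging in the logged values, CIR = 0x7FF and PIR = 0x7FE gives zero unused BDs, which is exactly the "full and no longer cleaned" state described above.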
Further analysis of the code revealed that the Rx BD ring will only be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met. Therefore, some debug logs were added to the driver and the current values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx BD ring was full. The logs are as follows.
[ 178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140 [ 178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110 [ 178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110
From the results, we can see that the maximum value of xdp_tx_in_flight reached 2140, yet the size of the Rx BD ring is only 2048. This should be impossible, so we checked the code again and found that the driver does not reset xdp_tx_in_flight when installing or uninstalling a bpf program, so xdp_tx_in_flight retains its value from the previous run.
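The failure mode above can be condensed into a small model: the clean condition compares against a counter that was never reset, so a stale value from the previous run blocks cleaning forever, and zeroing it at reconfigure time (as the patch does in enetc_reconfigure_xdp_cb()) unblocks it. Userspace sketch with the logged numbers; the structure and helper names are illustrative, not driver code.

```c
#include <assert.h>

/* Toy model of the Rx ring state involved in the bug. */
struct rx_ring { unsigned int xdp_tx_in_flight; };

/* The gating condition from the commit message: the Rx BD ring is only
 * cleaned when cleaned_cnt > xdp_tx_in_flight. */
static int may_clean(const struct rx_ring *r, unsigned int cleaned_cnt)
{
	return cleaned_cnt > r->xdp_tx_in_flight;
}

/* The fix: forget the in-flight count from the previous run when the
 * bpf program is swapped. */
static void reconfigure_xdp(struct rx_ring *r)
{
	r->xdp_tx_in_flight = 0;
}
```

With the stale value 2140 (larger than the whole 2048-entry ring), no achievable cleaned_cnt can satisfy the condition; after the reset, cleaning proceeds normally.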
Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()") Cc: stable@vger.kernel.org Signed-off-by: Wei Fang wei.fang@nxp.com --- drivers/net/ethernet/freescale/enetc/enetc.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c index 5830c046cb7d..3cff76923ab9 100644 --- a/drivers/net/ethernet/freescale/enetc/enetc.c +++ b/drivers/net/ethernet/freescale/enetc/enetc.c @@ -2769,6 +2769,7 @@ static int enetc_reconfigure_xdp_cb(struct enetc_ndev_priv *priv, void *ctx) for (i = 0; i < priv->num_rx_rings; i++) { struct enetc_bdr *rx_ring = priv->rx_ring[i];
+ rx_ring->xdp.xdp_tx_in_flight = 0; rx_ring->xdp.prog = prog;
if (prog)
On Thu, Sep 19, 2024 at 04:41:04PM +0800, Wei Fang wrote:
When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC on LS1028A, it was found that if the command was re-run multiple times, Rx could no longer receive frames, and the output of xdp-bench showed that the rx rate was 0.
root@ls1028ardb:~# ./xdp-bench tx eno0 Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc) Summary 2046 rx/s 0 err,drop/s Summary 0 rx/s 0 err,drop/s Summary 0 rx/s 0 err,drop/s Summary 0 rx/s 0 err,drop/s
By observing the Rx PIR and CIR registers, we found that CIR is always equal to 0x7FF and PIR is always 0x7FE, which means that the Rx ring is full and can no longer accommodate other Rx frames. Therefore, it is obvious that the RX BD ring has not been cleaned up.
Further analysis of the code revealed that the Rx BD ring will only be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met. Therefore, some debug logs were added to the driver and the current values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx BD ring was full. The logs are as follows.
[ 178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140 [ 178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110 [ 178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110
From the results, we can see that the maximum value of xdp_tx_in_flight has reached 2140. However, the size of the Rx BD ring is only 2048. This is incredible, so checked the code again and found that the driver did not reset xdp_tx_in_flight when installing or uninstalling bpf program, resulting in xdp_tx_in_flight still retaining the value after the last command was run.
Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()")
This does not explain why enetc_recycle_xdp_tx_buff(), which decreases xdp_tx_in_flight, does not get called?
In patch 2/3 you wrote:
| Tx BD rings are disabled first in enetc_stop() and then | wait for empty. This operation is not safe while the Tx BD ring | is actively transmitting frames, and will cause the ring to not | be empty and hardware exception. As described in the block guide | of LS1028A NETC, software should only disable an active ring after | all pending ring entries have been consumed (i.e. when PI = CI). | Disabling a transmit ring that is actively processing BDs risks | a HW-SW race hazard whereby a hardware resource becomes assigned | to work on one or more ring entries only to have those entries be | removed due to the ring becoming disabled. So the correct behavior | is that the software stops putting frames on the Tx BD rings (this | is what ENETC_TX_DOWN does), then waits for the Tx BD rings to be | empty, and finally disables the Tx BD rings.
I'm surprised that after fixing that, this change would still be needed, rather than xdp_tx_in_flight naturally dropping down to 0 when stopping NAPI. Why doesn't that happen, and what happens to the pending XDP_TX buffers?
-----Original Message----- From: Vladimir Oltean vladimir.oltean@nxp.com Sent: 2024年9月19日 21:22 To: Wei Fang wei.fang@nxp.com Cc: davem@davemloft.net; edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; Claudiu Manoil claudiu.manoil@nxp.com; ast@kernel.org; daniel@iogearbox.net; hawk@kernel.org; john.fastabend@gmail.com; linux-kernel@vger.kernel.org; netdev@vger.kernel.org; bpf@vger.kernel.org; stable@vger.kernel.org; imx@lists.linux.dev Subject: Re: [PATCH net 3/3] net: enetc: reset xdp_tx_in_flight when updating bpf program
On Thu, Sep 19, 2024 at 04:41:04PM +0800, Wei Fang wrote:
When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC on LS1028A, it was found that if the command was re-run multiple times, Rx could not receive the frames, and the result of xdp-bench showed that the rx rate was 0.
root@ls1028ardb:~# ./xdp-bench tx eno0 Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc) Summary 2046 rx/s 0
err,drop/s
Summary 0 rx/s 0
err,drop/s
Summary 0 rx/s 0
err,drop/s
Summary 0 rx/s 0
err,drop/s
By observing the Rx PIR and CIR registers, we found that CIR is always equal to 0x7FF and PIR is always 0x7FE, which means that the Rx ring is full and can no longer accommodate other Rx frames. Therefore, it is obvious that the RX BD ring has not been cleaned up.
Further analysis of the code revealed that the Rx BD ring will only be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met. Therefore, some debug logs were added to the driver and the current values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx BD ring was full. The logs are as follows.
[ 178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140 [ 178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110 [ 178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110
From the results, we can see that the maximum value of xdp_tx_in_flight has reached 2140. However, the size of the Rx BD ring is only 2048. This is incredible, so checked the code again and found that the driver did not reset xdp_tx_in_flight when installing or uninstalling bpf program, resulting in xdp_tx_in_flight still retaining the value after the last command was run.
Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()")
This does not explain why enetc_recycle_xdp_tx_buff(), which decreases xdp_tx_in_flight, does not get called?
In patch 2/3 you wrote:
| Tx BD rings are disabled first in enetc_stop() and then wait for | empty. This operation is not safe while the Tx BD ring is actively | transmitting frames, and will cause the ring to not be empty and | hardware exception. As described in the block guide of LS1028A NETC, | software should only disable an active ring after all pending ring | entries have been consumed (i.e. when PI = CI). | Disabling a transmit ring that is actively processing BDs risks a | HW-SW race hazard whereby a hardware resource becomes assigned to work | on one or more ring entries only to have those entries be removed due | to the ring becoming disabled. So the correct behavior is that the | software stops putting frames on the Tx BD rings (this is what | ENETC_TX_DOWN does), then waits for the Tx BD rings to be empty, and | finally disables the Tx BD rings.
I'm surprised that after fixing that, this change would still be needed, rather than xdp_tx_in_flight naturally dropping down to 0 when stopping NAPI. Why doesn't that happen, and what happens to the pending XDP_TX buffers?
The reason is that the interrupt is disabled (disable_irq() is called in enetc_stop()), so enetc_recycle_xdp_tx_buff() will not be called. Actually, all XDP_TX frames are sent out, and the XDP_TX buffers will be freed by enetc_free_rxtx_rings(). So there is no noticeable impact.
Another solution is to move disable_irq() to the end of enetc_stop(), so that the IRQ stays active until Tx is finished. In this case, xdp_tx_in_flight will naturally drop down to 0 as you expect.
On 2024-09-20 at 08:42:06, Wei Fang (wei.fang@nxp.com) wrote:
enetc_recycle_xdp_tx_buff() will not be called. Actually all XDP_TX frames are sent out and XDP_TX buffers will be freed by enetc_free_rxtx_rings().
Why didn't you choose enetc_free_rxtx_rings() to reset the in-flight count to 0?
-----Original Message-----
From: Ratheesh Kannoth <rkannoth@marvell.com>
Sent: September 23, 2024 12:56
To: Wei Fang <wei.fang@nxp.com>
Cc: Vladimir Oltean <vladimir.oltean@nxp.com>; davem@davemloft.net; edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; Claudiu Manoil <claudiu.manoil@nxp.com>; ast@kernel.org; daniel@iogearbox.net; hawk@kernel.org; john.fastabend@gmail.com; linux-kernel@vger.kernel.org; netdev@vger.kernel.org; bpf@vger.kernel.org; stable@vger.kernel.org; imx@lists.linux.dev
Subject: Re: [PATCH net 3/3] net: enetc: reset xdp_tx_in_flight when updating bpf program
On 2024-09-20 at 08:42:06, Wei Fang (wei.fang@nxp.com) wrote:
enetc_recycle_xdp_tx_buff() will not be called. Actually all XDP_TX frames are sent out and XDP_TX buffers will be freed by enetc_free_rxtx_rings().
Why didn't you choose enetc_free_rxtx_rings() to reset the in-flight count to 0?
IMO, enetc_reconfigure_xdp_cb() is a more appropriate place to reset xdp_tx_in_flight than enetc_free_rxtx_rings().
On Thu, Sep 19, 2024 at 04:41:04PM +0800, Wei Fang wrote:
When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC on LS1028A, it was found that if the command was re-run multiple times, Rx could not receive the frames, and the result of xdp-bench showed that the rx rate was 0.
root@ls1028ardb:~# ./xdp-bench tx eno0
Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc)
Summary      2046 rx/s      0 err,drop/s
Summary         0 rx/s      0 err,drop/s
Summary         0 rx/s      0 err,drop/s
Summary         0 rx/s      0 err,drop/s
By observing the Rx PIR and CIR registers, we found that CIR is always equal to 0x7FF and PIR is always 0x7FE, which means that the Rx ring is full and can no longer accommodate other Rx frames. Therefore, it is obvious that the RX BD ring has not been cleaned up.
Further analysis of the code revealed that the Rx BD ring will only be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met. Therefore, some debug logs were added to the driver and the current values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx BD ring was full. The logs are as follows.
[ 178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140
[ 178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110
[ 178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110
From the results, we can see that the maximum value of xdp_tx_in_flight reached 2140, while the size of the Rx BD ring is only 2048. That should be impossible, so we checked the code again and found that the driver did not reset xdp_tx_in_flight when installing or uninstalling a bpf program, so xdp_tx_in_flight still retained the value from the previous run.
Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()")
Cc: stable@vger.kernel.org
Signed-off-by: Wei Fang <wei.fang@nxp.com>
 drivers/net/ethernet/freescale/enetc/enetc.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 5830c046cb7d..3cff76923ab9 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -2769,6 +2769,7 @@ static int enetc_reconfigure_xdp_cb(struct enetc_ndev_priv *priv, void *ctx)
 	for (i = 0; i < priv->num_rx_rings; i++) {
 		struct enetc_bdr *rx_ring = priv->rx_ring[i];
 
+		rx_ring->xdp.xdp_tx_in_flight = 0;
zero init is good but shouldn't you be draining these buffers when removing XDP resources at least? what happens with DMA mappings that are related to these cached buffers?
 		rx_ring->xdp.prog = prog;
 
 		if (prog)
--
2.34.1
-----Original Message-----
From: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Sent: September 20, 2024 21:05
To: Wei Fang <wei.fang@nxp.com>
Cc: davem@davemloft.net; edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; Claudiu Manoil <claudiu.manoil@nxp.com>; Vladimir Oltean <vladimir.oltean@nxp.com>; ast@kernel.org; daniel@iogearbox.net; hawk@kernel.org; john.fastabend@gmail.com; linux-kernel@vger.kernel.org; netdev@vger.kernel.org; bpf@vger.kernel.org; stable@vger.kernel.org; imx@lists.linux.dev
Subject: Re: [PATCH net 3/3] net: enetc: reset xdp_tx_in_flight when updating bpf program
On Thu, Sep 19, 2024 at 04:41:04PM +0800, Wei Fang wrote:
When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC on LS1028A, it was found that if the command was re-run multiple times, Rx could not receive the frames, and the result of xdp-bench showed that the rx rate was 0.
root@ls1028ardb:~# ./xdp-bench tx eno0
Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc)
Summary      2046 rx/s      0 err,drop/s
Summary         0 rx/s      0 err,drop/s
Summary         0 rx/s      0 err,drop/s
Summary         0 rx/s      0 err,drop/s
By observing the Rx PIR and CIR registers, we found that CIR is always equal to 0x7FF and PIR is always 0x7FE, which means that the Rx ring is full and can no longer accommodate other Rx frames. Therefore, it is obvious that the RX BD ring has not been cleaned up.
Further analysis of the code revealed that the Rx BD ring will only be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met. Therefore, some debug logs were added to the driver and the current values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx BD ring was full. The logs are as follows.
[ 178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140
[ 178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110
[ 178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110
From the results, we can see that the maximum value of xdp_tx_in_flight reached 2140, while the size of the Rx BD ring is only 2048. That should be impossible, so we checked the code again and found that the driver did not reset xdp_tx_in_flight when installing or uninstalling a bpf program, so xdp_tx_in_flight still retained the value from the previous run.
Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()")
Cc: stable@vger.kernel.org
Signed-off-by: Wei Fang <wei.fang@nxp.com>
 drivers/net/ethernet/freescale/enetc/enetc.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 5830c046cb7d..3cff76923ab9 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -2769,6 +2769,7 @@ static int enetc_reconfigure_xdp_cb(struct enetc_ndev_priv *priv, void *ctx)
 	for (i = 0; i < priv->num_rx_rings; i++) {
 		struct enetc_bdr *rx_ring = priv->rx_ring[i];
 
+		rx_ring->xdp.xdp_tx_in_flight = 0;
zero init is good but shouldn't you be draining these buffers when removing XDP resources at least? what happens with DMA mappings that are related to these cached buffers?
All the buffers will be freed and DMA will be unmapped when the XDP program is installed. I am thinking that another solution may be better, which is mentioned in another thread replying to Vladimir, so that xdp_tx_in_flight will naturally drop to 0, and the TX-related statistics will be more accurate.
On Fri, Sep 20, 2024 at 05:05:14PM +0300, Wei Fang wrote:
zero init is good but shouldn't you be draining these buffers when removing XDP resources at least? what happens with DMA mappings that are related to these cached buffers?
All the buffers will be freed and DMA will be unmapped when the XDP program is installed.
There is still a problem with the patch you proposed here, which is that enetc_reconfigure() has one more call site, from enetc_hwtstamp_set(). If enetc_free_rxtx_rings() is the one that gets rid of the stale buffers, it should also be the one that resets xdp_tx_in_flight, otherwise you will still leave the problem unsolved where XDP_TX can be interrupted by a change in hwtstamping state, and the software "in flight" counter gets out of sync with the ring state.
Also, I suspect that the blamed commit is wrong. The normal netdev close path should also be susceptible to this issue, not just enetc_reconfigure(). Maybe something like ff58fda09096 ("net: enetc: prioritize ability to go down over packet processing") is the real culprit. That's when we started rushing the NAPI poll routine to finish. I don't think it was possible, before that, to close the netdev while there were XDP_TX frames pending to be recycled.
I am thinking that another solution may be better, which is mentioned in another thread replying to Vladimir, so that xdp_tx_in_flight will naturally drop to 0, and the TX-related statistics will be more accurate.
Please give me some more time to analyze the flow after just your patch 2/3. I have a draft reply, but I would still like to test some things.
-----Original Message-----
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Sent: September 20, 2024 22:25
To: Wei Fang <wei.fang@nxp.com>
Cc: Maciej Fijalkowski <maciej.fijalkowski@intel.com>; davem@davemloft.net; edumazet@google.com; kuba@kernel.org; pabeni@redhat.com; Claudiu Manoil <claudiu.manoil@nxp.com>; ast@kernel.org; daniel@iogearbox.net; hawk@kernel.org; john.fastabend@gmail.com; linux-kernel@vger.kernel.org; netdev@vger.kernel.org; bpf@vger.kernel.org; stable@vger.kernel.org; imx@lists.linux.dev
Subject: Re: [PATCH net 3/3] net: enetc: reset xdp_tx_in_flight when updating bpf program
On Fri, Sep 20, 2024 at 05:05:14PM +0300, Wei Fang wrote:
zero init is good but shouldn't you be draining these buffers when removing XDP resources at least? what happens with DMA mappings that are related to these cached buffers?

All the buffers will be freed and DMA will be unmapped when the XDP program is installed.
There is still a problem with the patch you proposed here, which is that enetc_reconfigure() has one more call site, from enetc_hwtstamp_set(). If enetc_free_rxtx_rings() is the one that gets rid of the stale buffers, it should also be the one that resets xdp_tx_in_flight, otherwise you will still leave the problem unsolved where XDP_TX can be interrupted by a change in hwtstamping state, and the software "in flight" counter gets out of sync with the ring state.
Yes, you are right. It's a potential issue if RX_TSTAMP is set when XDP is also enabled.
Also, I suspect that the blamed commit is wrong. The normal netdev close path should also be susceptible to this issue, not just enetc_reconfigure(). Maybe something like ff58fda09096 ("net: enetc: prioritize ability to go down over packet processing").
Thanks for the reminder, I will change the blamed commit in the next version.
That's when we started rushing the NAPI poll routine to finish. I don't think it was possible, before that, to close the netdev while there were XDP_TX frames pending to be recycled.
I am thinking that another solution may be better, which is mentioned in another thread replying to Vladimir, so that xdp_tx_in_flight will naturally drop to 0, and the TX-related statistics will be more accurate.
Please give me some more time to analyze the flow after just your patch 2/3. I have a draft reply, but I would still like to test some things.
Okay, I have tested this solution (see changes below), and from what I observed, the xdp_tx_in_flight can naturally drop to 0 in every test. So if there are no other risks, the next version will use this solution.
@@ -2467,10 +2469,6 @@ void enetc_start(struct net_device *ndev)
 	struct enetc_ndev_priv *priv = netdev_priv(ndev);
 	int i;
 
-	enetc_setup_interrupts(priv);
-
-	enetc_enable_tx_bdrs(priv);
-
 	for (i = 0; i < priv->bdr_int_num; i++) {
 		int irq = pci_irq_vector(priv->si->pdev,
 					 ENETC_BDR_INT_BASE_IDX + i);
@@ -2479,6 +2477,10 @@ void enetc_start(struct net_device *ndev)
 		enable_irq(irq);
 	}
 
+	enetc_setup_interrupts(priv);
+
+	enetc_enable_tx_bdrs(priv);
+
 	enetc_enable_rx_bdrs(priv);
 
 	netif_tx_start_all_queues(ndev);
@@ -2547,6 +2549,12 @@ void enetc_stop(struct net_device *ndev)
 
 	enetc_disable_rx_bdrs(priv);
 
+	enetc_wait_bdrs(priv);
+
+	enetc_disable_tx_bdrs(priv);
+
+	enetc_clear_interrupts(priv);
+
 	for (i = 0; i < priv->bdr_int_num; i++) {
 		int irq = pci_irq_vector(priv->si->pdev,
 					 ENETC_BDR_INT_BASE_IDX + i);
@@ -2555,12 +2563,6 @@ void enetc_stop(struct net_device *ndev)
 		napi_synchronize(&priv->int_vector[i]->napi);
 		napi_disable(&priv->int_vector[i]->napi);
 	}
-
-	enetc_wait_bdrs(priv);
-
-	enetc_disable_tx_bdrs(priv);
-
-	enetc_clear_interrupts(priv);
 }
 EXPORT_SYMBOL_GPL(enetc_stop);
Hi Wei,
On Mon, Sep 23, 2024 at 04:59:56AM +0300, Wei Fang wrote:
Okay, I have tested this solution (see changes below), and from what I observed, the xdp_tx_in_flight can naturally drop to 0 in every test. So if there are no other risks, the next version will use this solution.
Sorry for the delay. I have tested this variant and it works. Just one thing below.
@@ -2467,10 +2469,6 @@ void enetc_start(struct net_device *ndev)
 	struct enetc_ndev_priv *priv = netdev_priv(ndev);
 	int i;
 
-	enetc_setup_interrupts(priv);
-
-	enetc_enable_tx_bdrs(priv);
-
 	for (i = 0; i < priv->bdr_int_num; i++) {
 		int irq = pci_irq_vector(priv->si->pdev,
 					 ENETC_BDR_INT_BASE_IDX + i);
@@ -2479,6 +2477,10 @@ void enetc_start(struct net_device *ndev)
 		enable_irq(irq);
 	}
 
+	enetc_setup_interrupts(priv);
+
+	enetc_enable_tx_bdrs(priv);
+
 	enetc_enable_rx_bdrs(priv);
 
 	netif_tx_start_all_queues(ndev);
@@ -2547,6 +2549,12 @@ void enetc_stop(struct net_device *ndev)
 
 	enetc_disable_rx_bdrs(priv);
 
+	enetc_wait_bdrs(priv);
+
+	enetc_disable_tx_bdrs(priv);
+
+	enetc_clear_interrupts(priv);
Here, NAPI may still be scheduled. So if you clear interrupts, enetc_poll() on another CPU might still have time to re-enable them. This makes the call pointless.
Please move the enetc_clear_interrupts() call after the for() loop below (AKA leave it where it is).
 	for (i = 0; i < priv->bdr_int_num; i++) {
 		int irq = pci_irq_vector(priv->si->pdev,
 					 ENETC_BDR_INT_BASE_IDX + i);
@@ -2555,12 +2563,6 @@ void enetc_stop(struct net_device *ndev)
 		napi_synchronize(&priv->int_vector[i]->napi);
 		napi_disable(&priv->int_vector[i]->napi);
 	}
-
-	enetc_wait_bdrs(priv);
-
-	enetc_disable_tx_bdrs(priv);
-
-	enetc_clear_interrupts(priv);
 }
 EXPORT_SYMBOL_GPL(enetc_stop);
FWIW, there are at least 2 other valid ways of solving this problem. One has already been mentioned (reset the counter in enetc_free_rx_ring()):
@@ -2014,6 +2015,8 @@ static void enetc_free_rx_ring(struct enetc_bdr *rx_ring)
 			__free_page(rx_swbd->page);
 		rx_swbd->page = NULL;
 	}
+
+	rx_ring->xdp.xdp_tx_in_flight = 0;
 }
static void enetc_free_rxtx_rings(struct enetc_ndev_priv *priv)
And the other would be to keep rescheduling NAPI until there are no more pending XDP_TX frames.
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 3cff76923ab9..36520f8c49a6 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -1689,6 +1689,7 @@ static int enetc_poll(struct napi_struct *napi, int budget)
 		work_done = enetc_clean_rx_ring_xdp(rx_ring, napi, budget, prog);
 	else
 		work_done = enetc_clean_rx_ring(rx_ring, napi, budget);
-	if (work_done == budget)
+	if (work_done == budget || rx_ring->xdp.xdp_tx_in_flight)
 		complete = false;
 	if (work_done)
But I like your second proposal the best. It doesn't involve adding an unnecessary extra test in the fast path.
Hi Wei,
On Mon, Sep 23, 2024 at 04:59:56AM +0300, Wei Fang wrote:
Okay, I have tested this solution (see changes below), and from what I observed, the xdp_tx_in_flight can naturally drop to 0 in every test. So if there are no other risks, the next version will use this solution.
Sorry for the delay. I have tested this variant and it works. Just one thing below.
@@ -2467,10 +2469,6 @@ void enetc_start(struct net_device *ndev)
 	struct enetc_ndev_priv *priv = netdev_priv(ndev);
 	int i;
 
-	enetc_setup_interrupts(priv);
-
-	enetc_enable_tx_bdrs(priv);
-
 	for (i = 0; i < priv->bdr_int_num; i++) {
 		int irq = pci_irq_vector(priv->si->pdev,
 					 ENETC_BDR_INT_BASE_IDX + i);
@@ -2479,6 +2477,10 @@ void enetc_start(struct net_device *ndev)
 		enable_irq(irq);
 	}
 
+	enetc_setup_interrupts(priv);
+
+	enetc_enable_tx_bdrs(priv);
+
 	enetc_enable_rx_bdrs(priv);
 
 	netif_tx_start_all_queues(ndev);
@@ -2547,6 +2549,12 @@ void enetc_stop(struct net_device *ndev)
 
 	enetc_disable_rx_bdrs(priv);
 
+	enetc_wait_bdrs(priv);
+
+	enetc_disable_tx_bdrs(priv);
+
+	enetc_clear_interrupts(priv);
Here, NAPI may still be scheduled. So if you clear interrupts, enetc_poll() on another CPU might still have time to re-enable them. This makes the call pointless.
Please move the enetc_clear_interrupts() call after the for() loop below (AKA leave it where it is).
Okay, I will, thanks.
 	for (i = 0; i < priv->bdr_int_num; i++) {
 		int irq = pci_irq_vector(priv->si->pdev,
 					 ENETC_BDR_INT_BASE_IDX + i);
@@ -2555,12 +2563,6 @@ void enetc_stop(struct net_device *ndev)
 		napi_synchronize(&priv->int_vector[i]->napi);
 		napi_disable(&priv->int_vector[i]->napi);
 	}
-
-	enetc_wait_bdrs(priv);
-
-	enetc_disable_tx_bdrs(priv);
-
-	enetc_clear_interrupts(priv);
 }
 EXPORT_SYMBOL_GPL(enetc_stop);
FWIW, there are at least 2 other valid ways of solving this problem. One has already been mentioned (reset the counter in enetc_free_rx_ring()):
@@ -2014,6 +2015,8 @@ static void enetc_free_rx_ring(struct enetc_bdr *rx_ring)
 			__free_page(rx_swbd->page);
 		rx_swbd->page = NULL;
 	}
+
+	rx_ring->xdp.xdp_tx_in_flight = 0;
 }
 
 static void enetc_free_rxtx_rings(struct enetc_ndev_priv *priv)
And the other would be to keep rescheduling NAPI until there are no more pending XDP_TX frames.
diff --git a/drivers/net/ethernet/freescale/enetc/enetc.c b/drivers/net/ethernet/freescale/enetc/enetc.c
index 3cff76923ab9..36520f8c49a6 100644
--- a/drivers/net/ethernet/freescale/enetc/enetc.c
+++ b/drivers/net/ethernet/freescale/enetc/enetc.c
@@ -1689,6 +1689,7 @@ static int enetc_poll(struct napi_struct *napi, int budget)
 		work_done = enetc_clean_rx_ring_xdp(rx_ring, napi, budget, prog);
 	else
 		work_done = enetc_clean_rx_ring(rx_ring, napi, budget);
-	if (work_done == budget)
+	if (work_done == budget || rx_ring->xdp.xdp_tx_in_flight)
 		complete = false;
 	if (work_done)
But I like your second proposal the best. It doesn't involve adding an unnecessary extra test in the fast path.
On Thu, Sep 19, 2024 at 04:41:04PM +0800, Wei Fang wrote:
When running "xdp-bench tx eno0" to test the XDP_TX feature of ENETC on LS1028A, it was found that if the command was re-run multiple times, Rx could not receive the frames, and the result of xdp-bench showed that the rx rate was 0.
root@ls1028ardb:~# ./xdp-bench tx eno0
Hairpinning (XDP_TX) packets on eno0 (ifindex 3; driver fsl_enetc)
Summary      2046 rx/s      0 err,drop/s
Summary         0 rx/s      0 err,drop/s
Summary         0 rx/s      0 err,drop/s
Summary         0 rx/s      0 err,drop/s
By observing the Rx PIR and CIR registers, we found that CIR is always equal to 0x7FF and PIR is always 0x7FE, which means that the Rx ring is full and can no longer accommodate other Rx frames. Therefore, it is obvious that the RX BD ring has not been cleaned up.
I wouldn't call "obvious" something that you had to go and put debug prints for. There's nothing obvious about the manifestation of the issue.
Further analysis of the code revealed that the Rx BD ring will only be cleaned if the "cleaned_cnt > xdp_tx_in_flight" condition is met. Therefore, some debug logs were added to the driver and the current values of cleaned_cnt and xdp_tx_in_flight were printed when the Rx BD ring was full. The logs are as follows.
[ 178.762419] [XDP TX] >> cleaned_cnt:1728, xdp_tx_in_flight:2140
[ 178.771387] [XDP TX] >> cleaned_cnt:1941, xdp_tx_in_flight:2110
[ 178.776058] [XDP TX] >> cleaned_cnt:1792, xdp_tx_in_flight:2110
From the results, we can see that the maximum value of xdp_tx_in_flight reached 2140, while the size of the Rx BD ring is only 2048. That should be impossible, so we checked the code again and found that the driver did not reset xdp_tx_in_flight when installing or uninstalling a bpf program, so xdp_tx_in_flight still retained the value from the previous run.
When you resubmit, please be careful to readjust the commit message to the new patch. Especially this paragraph, but also the title.
Fixes: c33bfaf91c4c ("net: enetc: set up XDP program under enetc_reconfigure()")
Cc: stable@vger.kernel.org
Signed-off-by: Wei Fang <wei.fang@nxp.com>