When FIFO reaches near full state, device will issue pause frame. If pause slot is enabled(set to 1), in this time, device will issue pause frame only once. But if pause slot is disabled(set to 0), device will keep sending pause frames until FIFO reaches near empty state.
When pause slot is disabled, if there is no one to handle receive packets, device FIFO will reach near full state and keep sending pause frames. That will impact entire local area network.
This issue can be reproduced in Chromebox (not Chromebook) in developer mode running a test image (and v5.10 kernel): 1) ping -f $CHROMEBOX (from workstation on same local network) 2) run "powerd_dbus_suspend" from command line on the $CHROMEBOX 3) ping $ROUTER (wait until ping fails from workstation)
Takes about ~20-30 seconds after step 2 for the local network to stop working.
Fix this issue by enabling pause slot to only send pause frame once when FIFO reaches near full state.
Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125") Reported-by: Grant Grundler grundler@chromium.org Tested-by: Grant Grundler grundler@chromium.org Cc: stable@vger.kernel.org Signed-off-by: ChunHao Lin hau@realtek.com --- v2: - update comment and title. --- drivers/net/ethernet/realtek/r8169_main.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index 62cabeeb842a..bb787a52bc75 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -196,6 +196,7 @@ enum rtl_registers { /* No threshold before first PCI xfer */ #define RX_FIFO_THRESH (7 << RXCFG_FIFO_SHIFT) #define RX_EARLY_OFF (1 << 11) +#define RX_PAUSE_SLOT_ON (1 << 11) /* 8125b and later */ #define RXCFG_DMA_SHIFT 8 /* Unlimited maximum PCI burst. */ #define RX_DMA_BURST (7 << RXCFG_DMA_SHIFT) @@ -2306,9 +2307,13 @@ static void rtl_init_rxcfg(struct rtl8169_private *tp) case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_53: RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST | RX_EARLY_OFF); break; - case RTL_GIGA_MAC_VER_61 ... RTL_GIGA_MAC_VER_63: + case RTL_GIGA_MAC_VER_61: RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST); break; + case RTL_GIGA_MAC_VER_63: + RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST | + RX_PAUSE_SLOT_ON); + break; default: RTL_W32(tp, RxConfig, RX128_INT_EN | RX_DMA_BURST); break;
On 11/29/2023 7:53 AM, ChunHao Lin wrote:
When FIFO reaches near full state, device will issue pause frame. If pause slot is enabled(set to 1), in this time, device will issue pause frame only once. But if pause slot is disabled(set to 0), device will keep sending pause frames until FIFO reaches near empty state.
When pause slot is disabled, if there is no one to handle receive packets, device FIFO will reach near full state and keep sending pause frames. That will impact entire local area network.
This issue can be reproduced in Chromebox (not Chromebook) in developer mode running a test image (and v5.10 kernel):
- ping -f $CHROMEBOX (from workstation on same local network)
- run "powerd_dbus_suspend" from command line on the $CHROMEBOX
- ping $ROUTER (wait until ping fails from workstation)
Takes about ~20-30 seconds after step 2 for the local network to stop working.
Fix this issue by enabling pause slot to only send pause frame once when FIFO reaches near full state.
Makes sense. Avoiding the spam is good. The naming is a bit confusing but I guess that comes from realtek datasheet?
Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125") Reported-by: Grant Grundler grundler@chromium.org Tested-by: Grant Grundler grundler@chromium.org Cc: stable@vger.kernel.org Signed-off-by: ChunHao Lin hau@realtek.com
v2:
- update comment and title.
drivers/net/ethernet/realtek/r8169_main.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index 62cabeeb842a..bb787a52bc75 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -196,6 +196,7 @@ enum rtl_registers { /* No threshold before first PCI xfer */ #define RX_FIFO_THRESH (7 << RXCFG_FIFO_SHIFT) #define RX_EARLY_OFF (1 << 11) +#define RX_PAUSE_SLOT_ON (1 << 11) /* 8125b and later */
This confuses me though: RX_EARLY_OFF is (1 << 11) as well.. Is that from a different set of devices? We're writing to the same register RxConfig here I think in both cases?
Can you clarify if these are supposed to be the same bit?
#define RXCFG_DMA_SHIFT 8 /* Unlimited maximum PCI burst. */ #define RX_DMA_BURST (7 << RXCFG_DMA_SHIFT) @@ -2306,9 +2307,13 @@ static void rtl_init_rxcfg(struct rtl8169_private *tp) case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_53: RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST | RX_EARLY_OFF); break;
- case RTL_GIGA_MAC_VER_61 ... RTL_GIGA_MAC_VER_63:
- case RTL_GIGA_MAC_VER_61: RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST); break;
I assume there isn't a VER_62 between these?
- case RTL_GIGA_MAC_VER_63:
RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST |
RX_PAUSE_SLOT_ON);
We add RX_PAUSE_SLOT_ON now for RTL_GIGA_MAC_VER_63 in addition. Makes sense.
default: RTL_W32(tp, RxConfig, RX128_INT_EN | RX_DMA_BURST); break;break;
On Wed, Nov 29, 2023 at 3:05 PM Jacob Keller jacob.e.keller@intel.com wrote:
On 11/29/2023 7:53 AM, ChunHao Lin wrote:
When FIFO reaches near full state, device will issue pause frame. If pause slot is enabled(set to 1), in this time, device will issue pause frame only once. But if pause slot is disabled(set to 0), device will keep sending pause frames until FIFO reaches near empty state.
When pause slot is disabled, if there is no one to handle receive packets, device FIFO will reach near full state and keep sending pause frames. That will impact entire local area network.
This issue can be reproduced in Chromebox (not Chromebook) in developer mode running a test image (and v5.10 kernel):
- ping -f $CHROMEBOX (from workstation on same local network)
- run "powerd_dbus_suspend" from command line on the $CHROMEBOX
- ping $ROUTER (wait until ping fails from workstation)
Takes about ~20-30 seconds after step 2 for the local network to stop working.
Fix this issue by enabling pause slot to only send pause frame once when FIFO reaches near full state.
Makes sense. Avoiding the spam is good. The naming is a bit confusing but I guess that comes from realtek datasheet?
I don't know. It doesn't matter to me what it's called since I don't have access to the data sheet anyway. :/
Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125") Reported-by: Grant Grundler grundler@chromium.org Tested-by: Grant Grundler grundler@chromium.org Cc: stable@vger.kernel.org Signed-off-by: ChunHao Lin hau@realtek.com
v2:
- update comment and title.
drivers/net/ethernet/realtek/r8169_main.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index 62cabeeb842a..bb787a52bc75 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -196,6 +196,7 @@ enum rtl_registers { /* No threshold before first PCI xfer */ #define RX_FIFO_THRESH (7 << RXCFG_FIFO_SHIFT) #define RX_EARLY_OFF (1 << 11) +#define RX_PAUSE_SLOT_ON (1 << 11) /* 8125b and later */
This confuses me though: RX_EARLY_OFF is (1 << 11) as well.. Is that from a different set of devices?
Yes, for a different HW version of the device.
We're writing to the same register RxConfig here I think in both cases?
Yes. But to different versions of the HW which use this bit differently. Ergo the comment about "8125b and later".
Can you clarify if these are supposed to be the same bit?
Yes, they are the same bit - but different versions of HW use BIT(11) differently.
#define RXCFG_DMA_SHIFT 8 /* Unlimited maximum PCI burst. */ #define RX_DMA_BURST (7 << RXCFG_DMA_SHIFT) @@ -2306,9 +2307,13 @@ static void rtl_init_rxcfg(struct rtl8169_private *tp) case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_53: RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST | RX_EARLY_OFF); break;
case RTL_GIGA_MAC_VER_61 ... RTL_GIGA_MAC_VER_63:
case RTL_GIGA_MAC_VER_61: RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST); break;
I assume there isn't a VER_62 between these?
Correct. My clue is this code near the top of this file:
149 [RTL_GIGA_MAC_VER_61] = {"RTL8125A", FIRMWARE_8125A_3}, 150 /* reserve 62 for CFG_METHOD_4 in the vendor driver */ 151 [RTL_GIGA_MAC_VER_63] = {"RTL8125B", FIRMWARE_8125B_2},
case RTL_GIGA_MAC_VER_63:
RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST |
RX_PAUSE_SLOT_ON);
We add RX_PAUSE_SLOT_ON now for RTL_GIGA_MAC_VER_63 in addition. Makes sense.
Exactly.
thanks for reviewing!
cheers, grant
break; default: RTL_W32(tp, RxConfig, RX128_INT_EN | RX_DMA_BURST); break;
On 11/29/2023 3:40 PM, Grant Grundler wrote:
On Wed, Nov 29, 2023 at 3:05 PM Jacob Keller jacob.e.keller@intel.com wrote:
On 11/29/2023 7:53 AM, ChunHao Lin wrote:
When FIFO reaches near full state, device will issue pause frame. If pause slot is enabled(set to 1), in this time, device will issue pause frame only once. But if pause slot is disabled(set to 0), device will keep sending pause frames until FIFO reaches near empty state.
When pause slot is disabled, if there is no one to handle receive packets, device FIFO will reach near full state and keep sending pause frames. That will impact entire local area network.
This issue can be reproduced in Chromebox (not Chromebook) in developer mode running a test image (and v5.10 kernel):
- ping -f $CHROMEBOX (from workstation on same local network)
- run "powerd_dbus_suspend" from command line on the $CHROMEBOX
- ping $ROUTER (wait until ping fails from workstation)
Takes about ~20-30 seconds after step 2 for the local network to stop working.
Fix this issue by enabling pause slot to only send pause frame once when FIFO reaches near full state.
Makes sense. Avoiding the spam is good. The naming is a bit confusing but I guess that comes from realtek datasheet?
I don't know. It doesn't matter to me what it's called since I don't have access to the data sheet anyway. :/
The name is fine, i just found it a bit hard to parse since its effectively "PAUSE_SLOT_ON" makes us *not* send pause frames forever.
I think its fine as-is, since this is referring to the use of the pause slot in hardware.
Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125") Reported-by: Grant Grundler grundler@chromium.org Tested-by: Grant Grundler grundler@chromium.org Cc: stable@vger.kernel.org Signed-off-by: ChunHao Lin hau@realtek.com
v2:
- update comment and title.
drivers/net/ethernet/realtek/r8169_main.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index 62cabeeb842a..bb787a52bc75 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -196,6 +196,7 @@ enum rtl_registers { /* No threshold before first PCI xfer */ #define RX_FIFO_THRESH (7 << RXCFG_FIFO_SHIFT) #define RX_EARLY_OFF (1 << 11) +#define RX_PAUSE_SLOT_ON (1 << 11) /* 8125b and later */
This confuses me though: RX_EARLY_OFF is (1 << 11) as well.. Is that from a different set of devices?
Yes, for a different HW version of the device.
Great.
We're writing to the same register RxConfig here I think in both cases?
Yes. But to different versions of the HW which use this bit differently. Ergo the comment about "8125b and later".
Can you clarify if these are supposed to be the same bit?
Yes, they are the same bit - but different versions of HW use BIT(11) differently.
Thanks for the clarification!
#define RXCFG_DMA_SHIFT 8 /* Unlimited maximum PCI burst. */ #define RX_DMA_BURST (7 << RXCFG_DMA_SHIFT) @@ -2306,9 +2307,13 @@ static void rtl_init_rxcfg(struct rtl8169_private *tp) case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_53: RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST | RX_EARLY_OFF); break;
case RTL_GIGA_MAC_VER_61 ... RTL_GIGA_MAC_VER_63:
case RTL_GIGA_MAC_VER_61: RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST); break;
I assume there isn't a VER_62 between these?
Correct. My clue is this code near the top of this file:
149 [RTL_GIGA_MAC_VER_61] = {"RTL8125A", FIRMWARE_8125A_3}, 150 /* reserve 62 for CFG_METHOD_4 in the vendor driver */ 151 [RTL_GIGA_MAC_VER_63] = {"RTL8125B", FIRMWARE_8125B_2},
case RTL_GIGA_MAC_VER_63:
RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST |
RX_PAUSE_SLOT_ON);
We add RX_PAUSE_SLOT_ON now for RTL_GIGA_MAC_VER_63 in addition. Makes sense.
Exactly.
thanks for reviewing!
Great.
For the record:
Reviewed-by: Jacob Keller jacob.e.keller@intel.com
On Wed, 29 Nov 2023 23:53:50 +0800 ChunHao Lin wrote:
When FIFO reaches near full state, device will issue pause frame. If pause slot is enabled(set to 1), in this time, device will issue pause frame only once. But if pause slot is disabled(set to 0), device will keep sending pause frames until FIFO reaches near empty state.
Heiner, looks good?
On 29.11.2023 16:53, ChunHao Lin wrote:
When FIFO reaches near full state, device will issue pause frame. If pause slot is enabled(set to 1), in this time, device will issue pause frame only once. But if pause slot is disabled(set to 0), device will keep sending pause frames until FIFO reaches near empty state.
When pause slot is disabled, if there is no one to handle receive packets, device FIFO will reach near full state and keep sending pause frames. That will impact entire local area network.
This issue can be reproduced in Chromebox (not Chromebook) in developer mode running a test image (and v5.10 kernel):
- ping -f $CHROMEBOX (from workstation on same local network)
- run "powerd_dbus_suspend" from command line on the $CHROMEBOX
- ping $ROUTER (wait until ping fails from workstation)
Takes about ~20-30 seconds after step 2 for the local network to stop working.
Fix this issue by enabling pause slot to only send pause frame once when FIFO reaches near full state.
Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125") Reported-by: Grant Grundler grundler@chromium.org Tested-by: Grant Grundler grundler@chromium.org Cc: stable@vger.kernel.org Signed-off-by: ChunHao Lin hau@realtek.com
v2:
- update comment and title.
Reviewed-by: Heiner Kallweit hkallweit1@gmail.com
Hello:
This patch was applied to netdev/net.git (main) by Jakub Kicinski kuba@kernel.org:
On Wed, 29 Nov 2023 23:53:50 +0800 you wrote:
When FIFO reaches near full state, device will issue pause frame. If pause slot is enabled(set to 1), in this time, device will issue pause frame only once. But if pause slot is disabled(set to 0), device will keep sending pause frames until FIFO reaches near empty state.
When pause slot is disabled, if there is no one to handle receive packets, device FIFO will reach near full state and keep sending pause frames. That will impact entire local area network.
[...]
Here is the summary with links: - [net,v2] r8169: fix rtl8125b PAUSE frames blasting when suspended https://git.kernel.org/netdev/net/c/4b0768b6556a
You are awesome, thank you!
linux-stable-mirror@lists.linaro.org