Testing with RTL8822BE hardware, when available memory is low, we frequently see a kernel panic and system freeze.
First, rtw_pci_rx_isr encounters a memory allocation failure (trimmed):
rx routine starvation
WARNING: CPU: 7 PID: 9871 at drivers/net/wireless/realtek/rtw88/pci.c:822 rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
[ 2356.580313] RIP: 0010:rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
Then we see a variety of different error conditions and kernel panics, such as this one (trimmed):
rtw_pci 0000:02:00.0: pci bus timeout, check dma status
skbuff: skb_over_panic: text:00000000091b6e66 len:415 put:415 head:00000000d2880c6f data:000000007a02b1ea tail:0x1df end:0xc0 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:105!
invalid opcode: 0000 [#1] SMP NOPTI
RIP: 0010:skb_panic+0x43/0x45
When skb allocation fails and the "rx routine starvation" is hit, the function returns immediately without updating the RX ring. At this point, the RX ring may continue referencing an old skb which was already handed off to ieee80211_rx_irqsafe(). When it comes to be used again, bad things happen.
This patch allocates a new skb first in the RX ISR. If no memory is available, we discard the current frame, allowing the existing skb to be reused in the ring. Otherwise, we simplify the code flow and just hand the RX-populated skb over to mac80211.
In addition to fixing the kernel crash, the RX routine should now generally behave better under low memory conditions.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204053
Signed-off-by: Jian-Hong Pan jian-hong@endlessm.com
Reviewed-by: Daniel Drake drake@endlessm.com
Cc: stable@vger.kernel.org
---
 drivers/net/wireless/realtek/rtw88/pci.c | 28 +++++++++++-------------
 1 file changed, 13 insertions(+), 15 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtw88/pci.c b/drivers/net/wireless/realtek/rtw88/pci.c
index cfe05ba7280d..1bfc99ae6b84 100644
--- a/drivers/net/wireless/realtek/rtw88/pci.c
+++ b/drivers/net/wireless/realtek/rtw88/pci.c
@@ -786,6 +786,15 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci,
 		rx_desc = skb->data;
 		chip->ops->query_rx_desc(rtwdev, rx_desc, &pkt_stat, &rx_status);
 
+		/* discard current skb if the new skb cannot be allocated as a
+		 * new one in rx ring later
+		 * */
+		new = dev_alloc_skb(RTK_PCI_RX_BUF_SIZE);
+		if (WARN(!new, "rx routine starvation\n")) {
+			new = skb;
+			goto next_rp;
+		}
+
 		/* offset from rx_desc to payload */
 		pkt_offset = pkt_desc_sz + pkt_stat.drv_info_sz +
 			     pkt_stat.shift;
@@ -803,25 +812,14 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci,
 			skb_put(skb, pkt_stat.pkt_len);
 			skb_reserve(skb, pkt_offset);
 
-			/* alloc a smaller skb to mac80211 */
-			new = dev_alloc_skb(pkt_stat.pkt_len);
-			if (!new) {
-				new = skb;
-			} else {
-				skb_put_data(new, skb->data, skb->len);
-				dev_kfree_skb_any(skb);
-			}
 			/* TODO: merge into rx.c */
 			rtw_rx_stats(rtwdev, pkt_stat.vif, skb);
-			memcpy(new->cb, &rx_status, sizeof(rx_status));
-			ieee80211_rx_irqsafe(rtwdev->hw, new);
+			memcpy(skb->cb, &rx_status, sizeof(rx_status));
+			ieee80211_rx_irqsafe(rtwdev->hw, skb);
 		}
 
-		/* skb delivered to mac80211, alloc a new one in rx ring */
-		new = dev_alloc_skb(RTK_PCI_RX_BUF_SIZE);
-		if (WARN(!new, "rx routine starvation\n"))
-			return;
-
+next_rp:
+		/* skb delivered to mac80211, attach the new one into rx ring */
 		ring->buf[cur_rp] = new;
 		rtw_pci_reset_rx_desc(rtwdev, new, ring, cur_rp, buf_desc_sz);
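For readers following the ordering argument, here is a minimal userspace sketch of the "allocate the replacement first" idea. It is not driver code: struct buf, buf_alloc(), deliver() and ring_rx_one() are hypothetical names, calloc() stands in for dev_alloc_skb(), and the simulate_oom flag is an assumption used only to force the failure path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define RING_LEN 4
#define BUF_SIZE 64

/* Hypothetical stand-in for an skb owned by an RX ring slot. */
struct buf {
	unsigned char data[BUF_SIZE];
};

static bool simulate_oom;	/* when true, every allocation fails */
static int delivered;		/* frames handed to the "upper layer" */
static int dropped;		/* frames discarded for lack of memory */

static struct buf *buf_alloc(void)
{
	return simulate_oom ? NULL : calloc(1, sizeof(struct buf));
}

/* The upper layer takes ownership and eventually frees the buffer. */
static void deliver(struct buf *b)
{
	delivered++;
	free(b);
}

/*
 * Process one ring slot. The replacement buffer is allocated before
 * the filled buffer is given away, so the slot never ends up pointing
 * at memory the ring no longer owns.
 */
static void ring_rx_one(struct buf *ring[], size_t slot)
{
	struct buf *filled = ring[slot];
	struct buf *fresh = buf_alloc();

	assert(filled != NULL);
	if (!fresh) {
		/* No memory: drop the frame, keep using the old buffer. */
		dropped++;
		return;
	}
	deliver(filled);
	ring[slot] = fresh;
}
```

With simulate_oom set, ring_rx_one() leaves ring[slot] unchanged and only counts a drop, which is the behaviour the commit message describes for the failure path.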
Subject: [PATCH] rtw88/pci: Rearrange the memory usage for skb in RX ISR
nit, "rtw88: pci:" would be better.
When skb allocation fails and the "rx routine starvation" is hit, the function returns immediately without updating the RX ring. At this point, the RX ring may continue referencing an old skb which was already handed off to ieee80211_rx_irqsafe(). When it comes to be used again, bad things happen.
This patch allocates a new skb first in the RX ISR. If no memory is available, we discard the current frame, allowing the existing skb to be reused in the ring. Otherwise, we simplify the code flow and just hand the RX-populated skb over to mac80211.
In addition to fixing the kernel crash, the RX routine should now generally behave better under low memory conditions.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204053 Signed-off-by: Jian-Hong Pan jian-hong@endlessm.com Reviewed-by: Daniel Drake drake@endlessm.com Cc: stable@vger.kernel.org
drivers/net/wireless/realtek/rtw88/pci.c | 28 +++++++++++------------- 1 file changed, 13 insertions(+), 15 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtw88/pci.c b/drivers/net/wireless/realtek/rtw88/pci.c index cfe05ba7280d..1bfc99ae6b84 100644 --- a/drivers/net/wireless/realtek/rtw88/pci.c +++ b/drivers/net/wireless/realtek/rtw88/pci.c @@ -786,6 +786,15 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci, rx_desc = skb->data; chip->ops->query_rx_desc(rtwdev, rx_desc, &pkt_stat, &rx_status);
/* discard current skb if the new skb cannot be allocated as a
* new one in rx ring later
* */
nit, comment indentation.
new = dev_alloc_skb(RTK_PCI_RX_BUF_SIZE);
if (WARN(!new, "rx routine starvation\n")) {
new = skb;
goto next_rp;
}
/* offset from rx_desc to payload */ pkt_offset = pkt_desc_sz + pkt_stat.drv_info_sz + pkt_stat.shift;
@@ -803,25 +812,14 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci, skb_put(skb, pkt_stat.pkt_len); skb_reserve(skb, pkt_offset);
/* alloc a smaller skb to mac80211 */
new = dev_alloc_skb(pkt_stat.pkt_len);
if (!new) {
new = skb;
} else {
skb_put_data(new, skb->data, skb->len);
dev_kfree_skb_any(skb);
}
I am not sure it is fine to deliver every huge SKB to mac80211, because it will then be delivered to the TCP/IP stack. Hence I think we should either test whether performance is impacted, or find a more efficient way to send a smaller SKB to the mac80211 stack.
/* TODO: merge into rx.c */ rtw_rx_stats(rtwdev, pkt_stat.vif, skb);
memcpy(new->cb, &rx_status, sizeof(rx_status));
ieee80211_rx_irqsafe(rtwdev->hw, new);
memcpy(skb->cb, &rx_status, sizeof(rx_status));
ieee80211_rx_irqsafe(rtwdev->hw, skb); }
/* skb delivered to mac80211, alloc a new one in rx ring */
new = dev_alloc_skb(RTK_PCI_RX_BUF_SIZE);
if (WARN(!new, "rx routine starvation\n"))
return;
+next_rp:
/* skb delivered to mac80211, attach the new one into rx ring */ ring->buf[cur_rp] = new; rtw_pci_reset_rx_desc(rtwdev, new, ring, cur_rp, buf_desc_sz);
--
Yan-Hsuan
On Mon, Jul 8, 2019 at 3:23 PM, Tony Chuang yhchuang@realtek.com wrote:
Subject: [PATCH] rtw88/pci: Rearrange the memory usage for skb in RX ISR
nit, "rtw88: pci:" would be better.
Ok.
When skb allocation fails and the "rx routine starvation" is hit, the function returns immediately without updating the RX ring. At this point, the RX ring may continue referencing an old skb which was already handed off to ieee80211_rx_irqsafe(). When it comes to be used again, bad things happen.
This patch allocates a new skb first in the RX ISR. If no memory is available, we discard the current frame, allowing the existing skb to be reused in the ring. Otherwise, we simplify the code flow and just hand the RX-populated skb over to mac80211.
In addition to fixing the kernel crash, the RX routine should now generally behave better under low memory conditions.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204053 Signed-off-by: Jian-Hong Pan jian-hong@endlessm.com Reviewed-by: Daniel Drake drake@endlessm.com Cc: stable@vger.kernel.org
drivers/net/wireless/realtek/rtw88/pci.c | 28 +++++++++++------------- 1 file changed, 13 insertions(+), 15 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtw88/pci.c b/drivers/net/wireless/realtek/rtw88/pci.c index cfe05ba7280d..1bfc99ae6b84 100644 --- a/drivers/net/wireless/realtek/rtw88/pci.c +++ b/drivers/net/wireless/realtek/rtw88/pci.c @@ -786,6 +786,15 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci, rx_desc = skb->data; chip->ops->query_rx_desc(rtwdev, rx_desc, &pkt_stat, &rx_status);
/* discard current skb if the new skb cannot be allocated as a
* new one in rx ring later
* */
nit, comment indentation.
Thanks. I will fix this.
new = dev_alloc_skb(RTK_PCI_RX_BUF_SIZE);
if (WARN(!new, "rx routine starvation\n")) {
new = skb;
goto next_rp;
}
/* offset from rx_desc to payload */ pkt_offset = pkt_desc_sz + pkt_stat.drv_info_sz + pkt_stat.shift;
@@ -803,25 +812,14 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci, skb_put(skb, pkt_stat.pkt_len); skb_reserve(skb, pkt_offset);
/* alloc a smaller skb to mac80211 */
new = dev_alloc_skb(pkt_stat.pkt_len);
if (!new) {
new = skb;
} else {
skb_put_data(new, skb->data, skb->len);
dev_kfree_skb_any(skb);
}
I am not sure it is fine to deliver every huge SKB to mac80211, because it will then be delivered to the TCP/IP stack. Hence I think we should either test whether performance is impacted, or find a more efficient way to send a smaller SKB to the mac80211 stack.
As I recall, the network stack only processes the data region of an skb, which is bounded by the skb->data pointer and skb->len. It also checks the real buffer boundaries (head and end) of the skb to prevent memory overflow. Therefore, I think using the original skb is the most efficient way.
If I have misunderstood something, please point it out.
/* TODO: merge into rx.c */ rtw_rx_stats(rtwdev, pkt_stat.vif, skb);
memcpy(new->cb, &rx_status, sizeof(rx_status));
ieee80211_rx_irqsafe(rtwdev->hw, new);
memcpy(skb->cb, &rx_status, sizeof(rx_status));
ieee80211_rx_irqsafe(rtwdev->hw, skb); }
/* skb delivered to mac80211, alloc a new one in rx ring */
new = dev_alloc_skb(RTK_PCI_RX_BUF_SIZE);
if (WARN(!new, "rx routine starvation\n"))
return;
+next_rp:
/* skb delivered to mac80211, attach the new one into rx ring */ ring->buf[cur_rp] = new; rtw_pci_reset_rx_desc(rtwdev, new, ring, cur_rp, buf_desc_sz);
--
Yan-Hsuan
@@ -803,25 +812,14 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci, skb_put(skb, pkt_stat.pkt_len); skb_reserve(skb, pkt_offset);
/* alloc a smaller skb to mac80211 */
new = dev_alloc_skb(pkt_stat.pkt_len);
if (!new) {
new = skb;
} else {
skb_put_data(new, skb->data, skb->len);
dev_kfree_skb_any(skb);
}
I am not sure it is fine to deliver every huge SKB to mac80211, because it will then be delivered to the TCP/IP stack. Hence I think we should either test whether performance is impacted, or find a more efficient way to send a smaller SKB to the mac80211 stack.
As I recall, the network stack only processes the data region of an skb, which is bounded by the skb->data pointer and skb->len. It also checks the real buffer boundaries (head and end) of the skb to prevent memory overflow. Therefore, I think using the original skb is the most efficient way.
If I have misunderstood something, please point it out.
I mean that if we still use a huge SKB (~8K) for every RX packet (~1.5K), about 6.5K of each one goes unused. It gets worse if we ping with a large packet size (e.g. "$ ping -s 65536"): I am not sure whether those huge SKBs will eat up the whole SKB memory pool and make the ping fail.
BTW, RTK_PCI_RX_BUF_SIZE was originally set to (8192 + 24) so that an AMSDU packet can be received in one SKB. (It could probably be enlarged to ~11K to receive VHT AMSDU.)
Yan-Hsuan
From: Tony Chuang
Sent: 08 July 2019 10:00
@@ -803,25 +812,14 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci, skb_put(skb, pkt_stat.pkt_len); skb_reserve(skb, pkt_offset);
/* alloc a smaller skb to mac80211 */
new = dev_alloc_skb(pkt_stat.pkt_len);
if (!new) {
new = skb;
} else {
skb_put_data(new, skb->data, skb->len);
dev_kfree_skb_any(skb);
}
I am not sure it is fine to deliver every huge SKB to mac80211, because it will then be delivered to the TCP/IP stack. Hence I think we should either test whether performance is impacted, or find a more efficient way to send a smaller SKB to the mac80211 stack.
As I recall, the network stack only processes the data region of an skb, which is bounded by the skb->data pointer and skb->len. It also checks the real buffer boundaries (head and end) of the skb to prevent memory overflow. Therefore, I think using the original skb is the most efficient way.
If I have misunderstood something, please point it out.
I mean that if we still use a huge SKB (~8K) for every RX packet (~1.5K), about 6.5K of each one goes unused. It gets worse if we ping with a large packet size (e.g. "$ ping -s 65536"): I am not sure whether those huge SKBs will eat up the whole SKB memory pool and make the ping fail.
BTW, RTK_PCI_RX_BUF_SIZE was originally set to (8192 + 24) so that an AMSDU packet can be received in one SKB. (It could probably be enlarged to ~11K to receive VHT AMSDU.)
If you allocate 8192+24 the memory allocated will be either 12k or 16k and the skb truesize set appropriately. (Probably 16k if dma memory.) If this is fed into IP it is quite likely that a single byte of data will end up queued on the socket in 16k of dma-able memory. The 'truesize' stops this using all the system memory, but it isn't good for memory usage.
David
- Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
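As a rough illustration of the rounding described above, the sketch below computes where an 8192 + 24 byte request lands if the allocator rounds up to whole pages or to a power of two. It assumes a 4096-byte page and ignores the headroom and struct skb_shared_info overhead that a real skb allocation adds; round_up_pages() and round_up_pow2() are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SZ 4096u	/* assumed page size */

/* Round n up to a whole number of pages. */
static size_t round_up_pages(size_t n)
{
	return (n + PAGE_SZ - 1) / PAGE_SZ * PAGE_SZ;
}

/* Round n up to the next power of two. */
static size_t round_up_pow2(size_t n)
{
	size_t p = 1;

	assert(n > 0);
	while (p < n)
		p <<= 1;
	return p;
}
```

For a request of 8216 bytes, round_up_pages() gives 12288 (12k) and round_up_pow2() gives 16384 (16k), so a ~1.5K frame would occupy only a small fraction of either.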
From: Jian-Hong Pan
Sent: 08 July 2019 07:33 To: Yan-Hsuan Chuang; Kalle Valo; David S . Miller
Testing with RTL8822BE hardware, when available memory is low, we frequently see a kernel panic and system freeze.
First, rtw_pci_rx_isr encounters a memory allocation failure (trimmed):
rx routine starvation WARNING: CPU: 7 PID: 9871 at drivers/net/wireless/realtek/rtw88/pci.c:822 rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci] [ 2356.580313] RIP: 0010:rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
Then we see a variety of different error conditions and kernel panics, such as this one (trimmed):
rtw_pci 0000:02:00.0: pci bus timeout, check dma status skbuff: skb_over_panic: text:00000000091b6e66 len:415 put:415 head:00000000d2880c6f data:000000007a02b1ea tail:0x1df end:0xc0 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:105! invalid opcode: 0000 [#1] SMP NOPTI RIP: 0010:skb_panic+0x43/0x45
When skb allocation fails and the "rx routine starvation" is hit, the function returns immediately without updating the RX ring. At this point, the RX ring may continue referencing an old skb which was already handed off to ieee80211_rx_irqsafe(). When it comes to be used again, bad things happen.
This patch allocates a new skb first in the RX ISR. If no memory is available, we discard the current frame, allowing the existing skb to be reused in the ring. Otherwise, we simplify the code flow and just hand the RX-populated skb over to mac80211.
In addition to fixing the kernel crash, the RX routine should now generally behave better under low memory conditions.
Under low memory conditions it may be preferable to limit the amount of memory assigned to the receive ring.
I also thought it was preferable (DM may correct me here) to do the skb allocations from the 'bh' of the driver rather than from the hardware interrupt.
It is also almost certainly preferable (especially on IOMMU systems) to copy small frames into a new skb (of the right size) and then reuse the skb (with its dma-mapped buffer) for a later frame.
Allocating a new skb before any rx processing just seems wrong...
David
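The copy-and-reuse approach suggested above can be sketched in userspace roughly as follows. This is only a model under stated assumptions: struct frame and copy_out() are hypothetical names, malloc() stands in for an skb allocation, and RING_BUF_SIZE simply mirrors the 8192 + 24 figure quoted in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define RING_BUF_SIZE 8216	/* mirrors 8192 + 24 from the thread */

/* Hypothetical right-sized frame handed to the upper layer. */
struct frame {
	size_t len;
	unsigned char data[];	/* flexible array member */
};

/*
 * Copy len bytes out of the ring buffer into a freshly allocated,
 * exactly sized frame. The ring buffer itself is left untouched, so
 * the caller can hand it straight back to the hardware. Returns NULL
 * (frame dropped) if len is out of range or memory is unavailable.
 */
static struct frame *copy_out(const unsigned char *ring_buf, size_t len)
{
	struct frame *f;

	assert(ring_buf != NULL);
	if (len > RING_BUF_SIZE)
		return NULL;
	f = malloc(sizeof(*f) + len);
	if (!f)
		return NULL;
	f->len = len;
	memcpy(f->data, ring_buf, len);
	return f;
}
```

The trade-off is one memcpy per frame in exchange for never replacing the large ring buffer on the receive path.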
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204053 Signed-off-by: Jian-Hong Pan jian-hong@endlessm.com Reviewed-by: Daniel Drake drake@endlessm.com Cc: stable@vger.kernel.org
drivers/net/wireless/realtek/rtw88/pci.c | 28 +++++++++++------------- 1 file changed, 13 insertions(+), 15 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtw88/pci.c b/drivers/net/wireless/realtek/rtw88/pci.c index cfe05ba7280d..1bfc99ae6b84 100644 --- a/drivers/net/wireless/realtek/rtw88/pci.c +++ b/drivers/net/wireless/realtek/rtw88/pci.c @@ -786,6 +786,15 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci, rx_desc = skb->data; chip->ops->query_rx_desc(rtwdev, rx_desc, &pkt_stat, &rx_status);
/* discard current skb if the new skb cannot be allocated as a
* new one in rx ring later
* */
new = dev_alloc_skb(RTK_PCI_RX_BUF_SIZE);
if (WARN(!new, "rx routine starvation\n")) {
new = skb;
goto next_rp;
}
/* offset from rx_desc to payload */ pkt_offset = pkt_desc_sz + pkt_stat.drv_info_sz + pkt_stat.shift;
@@ -803,25 +812,14 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci, skb_put(skb, pkt_stat.pkt_len); skb_reserve(skb, pkt_offset);
/* alloc a smaller skb to mac80211 */
new = dev_alloc_skb(pkt_stat.pkt_len);
if (!new) {
new = skb;
} else {
skb_put_data(new, skb->data, skb->len);
dev_kfree_skb_any(skb);
} /* TODO: merge into rx.c */ rtw_rx_stats(rtwdev, pkt_stat.vif, skb);
memcpy(new->cb, &rx_status, sizeof(rx_status));
ieee80211_rx_irqsafe(rtwdev->hw, new);
memcpy(skb->cb, &rx_status, sizeof(rx_status));
ieee80211_rx_irqsafe(rtwdev->hw, skb); }
/* skb delivered to mac80211, alloc a new one in rx ring */
new = dev_alloc_skb(RTK_PCI_RX_BUF_SIZE);
if (WARN(!new, "rx routine starvation\n"))
return;
+next_rp:
/* skb delivered to mac80211, attach the new one into rx ring */ ring->buf[cur_rp] = new; rtw_pci_reset_rx_desc(rtwdev, new, ring, cur_rp, buf_desc_sz);
-- 2.22.0
On 7/8/19 1:32 AM, Jian-Hong Pan wrote:
diff --git a/drivers/net/wireless/realtek/rtw88/pci.c b/drivers/net/wireless/realtek/rtw88/pci.c index cfe05ba7280d..1bfc99ae6b84 100644 --- a/drivers/net/wireless/realtek/rtw88/pci.c +++ b/drivers/net/wireless/realtek/rtw88/pci.c @@ -786,6 +786,15 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci, rx_desc = skb->data; chip->ops->query_rx_desc(rtwdev, rx_desc, &pkt_stat, &rx_status);
/* discard current skb if the new skb cannot be allocated as a
* new one in rx ring later
* */
new = dev_alloc_skb(RTK_PCI_RX_BUF_SIZE);
if (WARN(!new, "rx routine starvation\n")) {
new = skb;
goto next_rp;
This should probably be a WARN_ONCE() rather than WARN(), otherwise the logs will be flooded once this condition triggers.
Larry
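To make the difference concrete, here is a small userspace model of "warn every time" versus "warn once". It is only an approximation: warn_always() and warn_once() are hypothetical functions, and the single static flag below is a simplification of the kernel's WARN_ONCE(), which is a macro that keeps its state per call site.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static int warnings_printed;

/* Print msg every time cond is true; returns cond. */
static bool warn_always(bool cond, const char *msg)
{
	assert(msg != NULL);
	if (cond) {
		fprintf(stderr, "WARNING: %s\n", msg);
		warnings_printed++;
	}
	return cond;
}

/* Print msg only the first time cond is true; still returns cond. */
static bool warn_once(bool cond, const char *msg)
{
	static bool already_warned;

	assert(msg != NULL);
	if (cond && !already_warned) {
		already_warned = true;
		fprintf(stderr, "WARNING: %s\n", msg);
		warnings_printed++;
	}
	return cond;
}
```

Both helpers return the condition, so the caller's error path still runs on every failure; only the logging is rate-limited in the warn_once() case.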
Testing with RTL8822BE hardware, when available memory is low, we frequently see a kernel panic and system freeze.
First, rtw_pci_rx_isr encounters a memory allocation failure (trimmed):
rx routine starvation
WARNING: CPU: 7 PID: 9871 at drivers/net/wireless/realtek/rtw88/pci.c:822 rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
[ 2356.580313] RIP: 0010:rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
Then we see a variety of different error conditions and kernel panics, such as this one (trimmed):
rtw_pci 0000:02:00.0: pci bus timeout, check dma status
skbuff: skb_over_panic: text:00000000091b6e66 len:415 put:415 head:00000000d2880c6f data:000000007a02b1ea tail:0x1df end:0xc0 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:105!
invalid opcode: 0000 [#1] SMP NOPTI
RIP: 0010:skb_panic+0x43/0x45
When skb allocation fails and the "rx routine starvation" is hit, the function returns immediately without updating the RX ring. At this point, the RX ring may continue referencing an old skb which was already handed off to ieee80211_rx_irqsafe(). When it comes to be used again, bad things happen.
This patch allocates a new, data-sized skb first in RX ISR. After copying the data in, we pass it to the upper layers. However, if skb allocation fails, we effectively drop the frame. In both cases, the original, full size ring skb is reused.
In addition to fixing the kernel crash, the RX routine should now generally behave better under low memory conditions.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204053
Signed-off-by: Jian-Hong Pan jian-hong@endlessm.com
Cc: stable@vger.kernel.org
---
 drivers/net/wireless/realtek/rtw88/pci.c | 49 +++++++++++-------------
 1 file changed, 22 insertions(+), 27 deletions(-)

diff --git a/drivers/net/wireless/realtek/rtw88/pci.c b/drivers/net/wireless/realtek/rtw88/pci.c
index cfe05ba7280d..e9fe3ad896c8 100644
--- a/drivers/net/wireless/realtek/rtw88/pci.c
+++ b/drivers/net/wireless/realtek/rtw88/pci.c
@@ -763,6 +763,7 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci,
 	u32 pkt_offset;
 	u32 pkt_desc_sz = chip->rx_pkt_desc_sz;
 	u32 buf_desc_sz = chip->rx_buf_desc_sz;
+	u32 new_len;
 	u8 *rx_desc;
 	dma_addr_t dma;
 
@@ -790,40 +791,34 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci,
 		pkt_offset = pkt_desc_sz + pkt_stat.drv_info_sz +
 			     pkt_stat.shift;
 
-		if (pkt_stat.is_c2h) {
-			/* keep rx_desc, halmac needs it */
-			skb_put(skb, pkt_stat.pkt_len + pkt_offset);
+		/* discard current skb if the new skb cannot be allocated as a
+		 * new one in rx ring later
+		 */
+		new_len = pkt_stat.pkt_len + pkt_offset;
+		new = dev_alloc_skb(new_len);
+		if (WARN_ONCE(!new, "rx routine starvation\n"))
+			goto next_rp;
+
+		/* put the DMA data including rx_desc from phy to new skb */
+		skb_put_data(new, skb->data, new_len);
 
-			/* pass offset for further operation */
-			*((u32 *)skb->cb) = pkt_offset;
-			skb_queue_tail(&rtwdev->c2h_queue, skb);
+		if (pkt_stat.is_c2h) {
+			/* pass rx_desc & offset for further operation */
+			*((u32 *)new->cb) = pkt_offset;
+			skb_queue_tail(&rtwdev->c2h_queue, new);
 			ieee80211_queue_work(rtwdev->hw, &rtwdev->c2h_work);
 		} else {
-			/* remove rx_desc, maybe use skb_pull? */
-			skb_put(skb, pkt_stat.pkt_len);
-			skb_reserve(skb, pkt_offset);
-
-			/* alloc a smaller skb to mac80211 */
-			new = dev_alloc_skb(pkt_stat.pkt_len);
-			if (!new) {
-				new = skb;
-			} else {
-				skb_put_data(new, skb->data, skb->len);
-				dev_kfree_skb_any(skb);
-			}
-			/* TODO: merge into rx.c */
-			rtw_rx_stats(rtwdev, pkt_stat.vif, skb);
+			/* remove rx_desc */
+			skb_pull(new, pkt_offset);
+
+			rtw_rx_stats(rtwdev, pkt_stat.vif, new);
 			memcpy(new->cb, &rx_status, sizeof(rx_status));
 			ieee80211_rx_irqsafe(rtwdev->hw, new);
 		}
 
-		/* skb delivered to mac80211, alloc a new one in rx ring */
-		new = dev_alloc_skb(RTK_PCI_RX_BUF_SIZE);
-		if (WARN(!new, "rx routine starvation\n"))
-			return;
-
-		ring->buf[cur_rp] = new;
-		rtw_pci_reset_rx_desc(rtwdev, new, ring, cur_rp, buf_desc_sz);
+next_rp:
+		/* new skb delivered to mac80211, re-enable original skb DMA */
+		rtw_pci_reset_rx_desc(rtwdev, skb, ring, cur_rp, buf_desc_sz);
 
 		/* host read next element in ring */
 		if (++cur_rp >= ring->r.len)
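For anyone unfamiliar with the put/pull pair used in this version, the following userspace model shows the effect of copying descriptor plus payload into a new buffer and then stripping the descriptor from the front. struct pkt, pkt_alloc(), pkt_put_data() and pkt_pull() are hypothetical and heavily simplified stand-ins for struct sk_buff, dev_alloc_skb(), skb_put_data() and skb_pull(); in particular, this model returns false on overflow where the real skb_put() would trigger skb_over_panic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-in for struct sk_buff. */
struct pkt {
	unsigned char *head;	/* start of the allocation */
	unsigned char *data;	/* start of the valid bytes */
	size_t len;		/* number of valid bytes */
	size_t cap;		/* size of the allocation */
};

static struct pkt *pkt_alloc(size_t cap)
{
	struct pkt *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	p->head = malloc(cap ? cap : 1);
	if (!p->head) {
		free(p);
		return NULL;
	}
	p->data = p->head;
	p->len = 0;
	p->cap = cap;
	return p;
}

static void pkt_free(struct pkt *p)
{
	if (p) {
		free(p->head);
		free(p);
	}
}

/* Append n bytes from src; refuses to write past the allocation. */
static bool pkt_put_data(struct pkt *p, const void *src, size_t n)
{
	size_t used = (size_t)(p->data - p->head) + p->len;

	assert(used <= p->cap);
	if (n > p->cap - used)
		return false;
	memcpy(p->data + p->len, src, n);
	p->len += n;
	return true;
}

/* Drop n bytes from the front; returns the new data pointer or NULL. */
static unsigned char *pkt_pull(struct pkt *p, size_t n)
{
	if (n > p->len)
		return NULL;
	p->data += n;
	p->len -= n;
	return p->data;
}
```

Pulling only advances the data pointer and shrinks len; no bytes are moved, which is why removing rx_desc after the copy is cheap.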
Subject: [PATCH v2 1/2] rtw88: pci: Rearrange the memory usage for skb in RX ISR
Testing with RTL8822BE hardware, when available memory is low, we frequently see a kernel panic and system freeze.
First, rtw_pci_rx_isr encounters a memory allocation failure (trimmed):
rx routine starvation WARNING: CPU: 7 PID: 9871 at drivers/net/wireless/realtek/rtw88/pci.c:822 rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci] [ 2356.580313] RIP: 0010:rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
Then we see a variety of different error conditions and kernel panics, such as this one (trimmed):
rtw_pci 0000:02:00.0: pci bus timeout, check dma status skbuff: skb_over_panic: text:00000000091b6e66 len:415 put:415 head:00000000d2880c6f data:000000007a02b1ea tail:0x1df end:0xc0 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:105! invalid opcode: 0000 [#1] SMP NOPTI RIP: 0010:skb_panic+0x43/0x45
When skb allocation fails and the "rx routine starvation" is hit, the function returns immediately without updating the RX ring. At this point, the RX ring may continue referencing an old skb which was already handed off to ieee80211_rx_irqsafe(). When it comes to be used again, bad things happen.
This patch allocates a new, data-sized skb first in RX ISR. After copying the data in, we pass it to the upper layers. However, if skb allocation fails, we effectively drop the frame. In both cases, the original, full size ring skb is reused.
In addition to fixing the kernel crash, the RX routine should now generally behave better under low memory conditions.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204053 Signed-off-by: Jian-Hong Pan jian-hong@endlessm.com Cc: stable@vger.kernel.org
drivers/net/wireless/realtek/rtw88/pci.c | 49 +++++++++++------------- 1 file changed, 22 insertions(+), 27 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtw88/pci.c b/drivers/net/wireless/realtek/rtw88/pci.c index cfe05ba7280d..e9fe3ad896c8 100644 --- a/drivers/net/wireless/realtek/rtw88/pci.c +++ b/drivers/net/wireless/realtek/rtw88/pci.c @@ -763,6 +763,7 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci, u32 pkt_offset; u32 pkt_desc_sz = chip->rx_pkt_desc_sz; u32 buf_desc_sz = chip->rx_buf_desc_sz;
u32 new_len; u8 *rx_desc; dma_addr_t dma;
@@ -790,40 +791,34 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci, pkt_offset = pkt_desc_sz + pkt_stat.drv_info_sz + pkt_stat.shift;
if (pkt_stat.is_c2h) {
/* keep rx_desc, halmac needs it */
skb_put(skb, pkt_stat.pkt_len + pkt_offset);
/* discard current skb if the new skb cannot be allocated as a
* new one in rx ring later
*/
new_len = pkt_stat.pkt_len + pkt_offset;
new = dev_alloc_skb(new_len);
if (WARN_ONCE(!new, "rx routine starvation\n"))
goto next_rp;
/* put the DMA data including rx_desc from phy to new skb */
skb_put_data(new, skb->data, new_len);
/* pass offset for further operation */
*((u32 *)skb->cb) = pkt_offset;
skb_queue_tail(&rtwdev->c2h_queue, skb);
if (pkt_stat.is_c2h) {
/* pass rx_desc & offset for further operation */
*((u32 *)new->cb) = pkt_offset;
skb_queue_tail(&rtwdev->c2h_queue, new); ieee80211_queue_work(rtwdev->hw, &rtwdev->c2h_work); } else {
/* remove rx_desc, maybe use skb_pull? */
skb_put(skb, pkt_stat.pkt_len);
skb_reserve(skb, pkt_offset);
/* alloc a smaller skb to mac80211 */
new = dev_alloc_skb(pkt_stat.pkt_len);
if (!new) {
new = skb;
} else {
skb_put_data(new, skb->data, skb->len);
dev_kfree_skb_any(skb);
}
/* TODO: merge into rx.c */
rtw_rx_stats(rtwdev, pkt_stat.vif, skb);
/* remove rx_desc */
skb_pull(new, pkt_offset);
rtw_rx_stats(rtwdev, pkt_stat.vif, new); memcpy(new->cb, &rx_status, sizeof(rx_status)); ieee80211_rx_irqsafe(rtwdev->hw, new); }
/* skb delivered to mac80211, alloc a new one in rx ring */
new = dev_alloc_skb(RTK_PCI_RX_BUF_SIZE);
if (WARN(!new, "rx routine starvation\n"))
return;
ring->buf[cur_rp] = new;
rtw_pci_reset_rx_desc(rtwdev, new, ring, cur_rp, buf_desc_sz);
+next_rp:
/* new skb delivered to mac80211, re-enable original skb DMA */
rtw_pci_reset_rx_desc(rtwdev, skb, ring, cur_rp, buf_desc_sz);
/* host read next element in ring */ if (++cur_rp >= ring->r.len)
-- 2.22.0
Now it looks good to me. Thanks.
Acked-by: Yan-Hsuan Chuang yhchuang@realtek.com
Yan-Hsuan
On Wed, Jul 10, 2019 at 4:37 PM, Tony Chuang yhchuang@realtek.com wrote:
Subject: [PATCH v2 1/2] rtw88: pci: Rearrange the memory usage for skb in RX ISR
Testing with RTL8822BE hardware, when available memory is low, we frequently see a kernel panic and system freeze.
First, rtw_pci_rx_isr encounters a memory allocation failure (trimmed):
rx routine starvation WARNING: CPU: 7 PID: 9871 at drivers/net/wireless/realtek/rtw88/pci.c:822 rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci] [ 2356.580313] RIP: 0010:rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
Then we see a variety of different error conditions and kernel panics, such as this one (trimmed):
rtw_pci 0000:02:00.0: pci bus timeout, check dma status skbuff: skb_over_panic: text:00000000091b6e66 len:415 put:415 head:00000000d2880c6f data:000000007a02b1ea tail:0x1df end:0xc0 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:105! invalid opcode: 0000 [#1] SMP NOPTI RIP: 0010:skb_panic+0x43/0x45
When skb allocation fails and the "rx routine starvation" is hit, the function returns immediately without updating the RX ring. At this point, the RX ring may continue referencing an old skb which was already handed off to ieee80211_rx_irqsafe(). When it comes to be used again, bad things happen.
This patch allocates a new, data-sized skb first in RX ISR. After copying the data in, we pass it to the upper layers. However, if skb allocation fails, we effectively drop the frame. In both cases, the original, full size ring skb is reused.
In addition to fixing the kernel crash, the RX routine should now generally behave better under low memory conditions.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204053 Signed-off-by: Jian-Hong Pan jian-hong@endlessm.com Cc: stable@vger.kernel.org
drivers/net/wireless/realtek/rtw88/pci.c | 49 +++++++++++------------- 1 file changed, 22 insertions(+), 27 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtw88/pci.c b/drivers/net/wireless/realtek/rtw88/pci.c index cfe05ba7280d..e9fe3ad896c8 100644 --- a/drivers/net/wireless/realtek/rtw88/pci.c +++ b/drivers/net/wireless/realtek/rtw88/pci.c @@ -763,6 +763,7 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci, u32 pkt_offset; u32 pkt_desc_sz = chip->rx_pkt_desc_sz; u32 buf_desc_sz = chip->rx_buf_desc_sz;
u32 new_len; u8 *rx_desc; dma_addr_t dma;
@@ -790,40 +791,34 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci, pkt_offset = pkt_desc_sz + pkt_stat.drv_info_sz + pkt_stat.shift;
if (pkt_stat.is_c2h) {
/* keep rx_desc, halmac needs it */
skb_put(skb, pkt_stat.pkt_len + pkt_offset);
/* discard current skb if the new skb cannot be allocated as a
* new one in rx ring later
*/
new_len = pkt_stat.pkt_len + pkt_offset;
new = dev_alloc_skb(new_len);
if (WARN_ONCE(!new, "rx routine starvation\n"))
goto next_rp;
/* put the DMA data including rx_desc from phy to new skb */
skb_put_data(new, skb->data, new_len);
/* pass offset for further operation */
*((u32 *)skb->cb) = pkt_offset;
skb_queue_tail(&rtwdev->c2h_queue, skb);
if (pkt_stat.is_c2h) {
/* pass rx_desc & offset for further operation */
*((u32 *)new->cb) = pkt_offset;
skb_queue_tail(&rtwdev->c2h_queue, new); ieee80211_queue_work(rtwdev->hw, &rtwdev->c2h_work); } else {
/* remove rx_desc, maybe use skb_pull? */
skb_put(skb, pkt_stat.pkt_len);
skb_reserve(skb, pkt_offset);
/* alloc a smaller skb to mac80211 */
new = dev_alloc_skb(pkt_stat.pkt_len);
if (!new) {
new = skb;
} else {
skb_put_data(new, skb->data, skb->len);
dev_kfree_skb_any(skb);
}
/* TODO: merge into rx.c */
rtw_rx_stats(rtwdev, pkt_stat.vif, skb);
/* remove rx_desc */
skb_pull(new, pkt_offset);
rtw_rx_stats(rtwdev, pkt_stat.vif, new); memcpy(new->cb, &rx_status, sizeof(rx_status)); ieee80211_rx_irqsafe(rtwdev->hw, new); }
/* skb delivered to mac80211, alloc a new one in rx ring */
new = dev_alloc_skb(RTK_PCI_RX_BUF_SIZE);
if (WARN(!new, "rx routine starvation\n"))
return;
ring->buf[cur_rp] = new;
rtw_pci_reset_rx_desc(rtwdev, new, ring, cur_rp, buf_desc_sz);
+next_rp:
/* new skb delivered to mac80211, re-enable original skb DMA */
rtw_pci_reset_rx_desc(rtwdev, skb, ring, cur_rp, buf_desc_sz); /* host read next element in ring */ if (++cur_rp >= ring->r.len)
-- 2.22.0
Now it looks good to me. Thanks.
Acked-by: Yan-Hsuan Chuang yhchuang@realtek.com
Yan-Hsuan
Thanks for your ack. However, I have just sent the version 3 patches (including [PATCH v3 2/2] rtw88: pci: Use DMA sync instead of remapping in RX ISR), following Christoph's comment. [1]
Could you please also review the 2 patches of version 3? Thank you.
[1]: https://lkml.org/lkml/2019/7/9/507
Jian-Hong Pan