Testing with RTL8822BE hardware, when available memory is low, we
frequently see a kernel panic and system freeze.
First, rtw_pci_rx_isr encounters a memory allocation failure (trimmed):
rx routine starvation
WARNING: CPU: 7 PID: 9871 at drivers/net/wireless/realtek/rtw88/pci.c:822 rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
[ 2356.580313] RIP: 0010:rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
Then we see a variety of different error conditions and kernel panics,
such as this one (trimmed):
rtw_pci 0000:02:00.0: pci bus timeout, check dma status
skbuff: skb_over_panic: text:00000000091b6e66 len:415 put:415 head:00000000d2880c6f data:000000007a02b1ea tail:0x1df end:0xc0 dev:<NULL>
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:105!
invalid opcode: 0000 [#1] SMP NOPTI
RIP: 0010:skb_panic+0x43/0x45
When skb allocation fails and the "rx routine starvation" is hit, the
function returns immediately without updating the RX ring. At this
point, the RX ring may continue referencing an old skb which was already
handed off to ieee80211_rx_irqsafe(). When it comes to be used again,
bad things happen.
This patch allocates a new skb first in RX ISR. If we don't have memory
available, we discard the current frame, allowing the existing skb to be
reused in the ring. Otherwise, we simplify the code flow and just hand
over the RX-populated skb over to mac80211.
In addition, to fixing the kernel crash, the RX routine should now
generally behave better under low memory conditions.
Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=204053
Signed-off-by: Jian-Hong Pan <jian-hong(a)endlessm.com>
Reviewed-by: Daniel Drake <drake(a)endlessm.com>
Cc: <stable(a)vger.kernel.org>
---
drivers/net/wireless/realtek/rtw88/pci.c | 28 +++++++++++-------------
1 file changed, 13 insertions(+), 15 deletions(-)
diff --git a/drivers/net/wireless/realtek/rtw88/pci.c b/drivers/net/wireless/realtek/rtw88/pci.c
index cfe05ba7280d..1bfc99ae6b84 100644
--- a/drivers/net/wireless/realtek/rtw88/pci.c
+++ b/drivers/net/wireless/realtek/rtw88/pci.c
@@ -786,6 +786,15 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci,
rx_desc = skb->data;
chip->ops->query_rx_desc(rtwdev, rx_desc, &pkt_stat, &rx_status);
+ /* discard current skb if the new skb cannot be allocated as a
+ * new one in rx ring later
+ * */
+ new = dev_alloc_skb(RTK_PCI_RX_BUF_SIZE);
+ if (WARN(!new, "rx routine starvation\n")) {
+ new = skb;
+ goto next_rp;
+ }
+
/* offset from rx_desc to payload */
pkt_offset = pkt_desc_sz + pkt_stat.drv_info_sz +
pkt_stat.shift;
@@ -803,25 +812,14 @@ static void rtw_pci_rx_isr(struct rtw_dev *rtwdev, struct rtw_pci *rtwpci,
skb_put(skb, pkt_stat.pkt_len);
skb_reserve(skb, pkt_offset);
- /* alloc a smaller skb to mac80211 */
- new = dev_alloc_skb(pkt_stat.pkt_len);
- if (!new) {
- new = skb;
- } else {
- skb_put_data(new, skb->data, skb->len);
- dev_kfree_skb_any(skb);
- }
/* TODO: merge into rx.c */
rtw_rx_stats(rtwdev, pkt_stat.vif, skb);
- memcpy(new->cb, &rx_status, sizeof(rx_status));
- ieee80211_rx_irqsafe(rtwdev->hw, new);
+ memcpy(skb->cb, &rx_status, sizeof(rx_status));
+ ieee80211_rx_irqsafe(rtwdev->hw, skb);
}
- /* skb delivered to mac80211, alloc a new one in rx ring */
- new = dev_alloc_skb(RTK_PCI_RX_BUF_SIZE);
- if (WARN(!new, "rx routine starvation\n"))
- return;
-
+next_rp:
+ /* skb delivered to mac80211, attach the new one into rx ring */
ring->buf[cur_rp] = new;
rtw_pci_reset_rx_desc(rtwdev, new, ring, cur_rp, buf_desc_sz);
--
2.22.0