On Tue, 30 Jan 2024 13:00:10 +0800 Jisheng Zhang <jszhang@kernel.org> wrote:
On Sun, Jan 28, 2024 at 08:35:29PM +0100, Petr Tesarik wrote:
As explained by a comment in <linux/u64_stats_sync.h>, the write side of struct u64_stats_sync must ensure mutual exclusion, or one seqcount update could be lost on 32-bit platforms, thus blocking readers forever. Such lockups have been observed in the real world after stmmac_xmit() on one CPU raced with stmmac_napi_poll_tx() on another CPU.
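To illustrate the lost update described above, here is a minimal userspace sketch. It is not kernel code: struct model_seqcount and all model_* helpers are hypothetical stand-ins, and the non-atomic increment is split into an explicit load and store so that one possible interleaving of two writers can be replayed deterministically.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a sequence counter. */
struct model_seqcount {
	unsigned int sequence;
};

/* A non-atomic "sequence++" is a load followed by a store. */
static unsigned int model_load(const struct model_seqcount *s)
{
	return s->sequence;
}

static void model_store_inc(struct model_seqcount *s, unsigned int loaded)
{
	s->sequence = loaded + 1;
}

/* Readers may only proceed while the count is even (no writer active). */
static bool model_reader_can_proceed(const struct model_seqcount *s)
{
	return (s->sequence & 1) == 0;
}

/* Two writers enter the update section at the same time. */
static unsigned int model_racing_writers(void)
{
	struct model_seqcount s = { 0 };
	unsigned int a, b;

	/* Both "begin" increments load the same value; one is lost. */
	a = model_load(&s);
	b = model_load(&s);
	model_store_inc(&s, a);
	model_store_inc(&s, b);

	/* Both "end" increments then run one after the other. */
	a = model_load(&s);
	model_store_inc(&s, a);
	b = model_load(&s);
	model_store_inc(&s, b);

	return s.sequence;
}

/* The same two writers, serialized by some outer lock. */
static unsigned int model_serialized_writers(void)
{
	struct model_seqcount s = { 0 };
	int writer;

	for (writer = 0; writer < 2; writer++) {
		/* With mutual exclusion, each writer starts from an even count. */
		assert(model_reader_can_proceed(&s));
		model_store_inc(&s, model_load(&s));	/* begin */
		model_store_inc(&s, model_load(&s));	/* end */
	}
	return s.sequence;
}
```

In this replay the racing writers leave the count odd, so a reader waiting for an even value would never make progress, while the serialized writers leave it even.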
To fix the issue without introducing a new lock, split the statistics into three parts:
- fields updated only under the tx queue lock,
- fields updated only during NAPI poll,
- fields updated only from interrupt context.
Updates to fields in the first two groups are already serialized through other locks. It is sufficient to split the existing struct u64_stats_sync so that each group has its own.
Note that tx_set_ic_bit is updated from both contexts. Split this counter so that each context gets its own, and calculate their sum to get the total value in stmmac_get_ethtool_stats().
For the third group, multiple interrupts may be processed by different CPUs at the same time, but interrupts on the same CPU will not nest. Move fields from this group to a newly created per-cpu struct stmmac_pcpu_stats.
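The resulting layout can be pictured roughly as follows. This is only a userspace sketch under assumptions: every model_* name, the irq_n field and MODEL_NR_CPUS are hypothetical, and struct model_sync merely stands in for struct u64_stats_sync. Only tx_set_ic_bit is a counter name taken from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4	/* assumption: number of possible CPUs */

/* Hypothetical stand-in for struct u64_stats_sync. */
struct model_sync {
	unsigned int sequence;
};

/* Group 1: written only under the tx queue lock. */
struct model_q_stats {
	struct model_sync syncp;
	uint64_t tx_set_ic_bit;
};

/* Group 2: written only during NAPI poll. */
struct model_napi_stats {
	struct model_sync syncp;
	uint64_t tx_set_ic_bit;
};

/* Group 3: written from interrupt context, one instance per CPU. */
struct model_pcpu_stats {
	struct model_sync syncp;
	uint64_t irq_n;
};

/* Each group carries its own sync object, so writers never share one. */
struct model_txq_stats {
	struct model_q_stats q;
	struct model_napi_stats napi;
};

/* The split counter is reported as the sum of its two halves. */
static uint64_t model_total_ic_bit(const struct model_txq_stats *s)
{
	return s->q.tx_set_ic_bit + s->napi.tx_set_ic_bit;
}

/* Per-CPU counters are summed over all CPUs at read time. */
static uint64_t model_total_irqs(const struct model_pcpu_stats *pcpu,
				 size_t nr_cpus)
{
	uint64_t sum = 0;
	size_t cpu;

	assert(pcpu != NULL || nr_cpus == 0);
	for (cpu = 0; cpu < nr_cpus; cpu++)
		sum += pcpu[cpu].irq_n;
	return sum;
}
```

The sketch omits the reader-side retry loop; in the kernel each group would be read under its own u64_stats_fetch_begin()/u64_stats_fetch_retry() pair before the values are added up.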
Fixes: 133466c3bbe1 ("net: stmmac: use per-queue 64 bit statistics where necessary")
Link: https://lore.kernel.org/netdev/Za173PhviYg-1qIn@torres.zugschlus.de/t/
Cc: stable@vger.kernel.org
Signed-off-by: Petr Tesarik <petr@tesarici.cz>
Thanks for the fix patch. One trivial improvement below: s/netdev_alloc_pcpu_stats/devm_netdev_alloc_pcpu_stats/ to simplify the error and exit code paths.
Thanks for your review.
In fact, many other allocations in stmmac could be converted to devm_*. I wanted to stay consistent with the existing code, but hey, you're right, there's no good reason for it.
Plus, I can convert the other places in a separate patch.
With that: Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
PS: when I sent the above "net: stmmac: use per-queue 64 bit statistics where necessary", I had the impression that there are too many statistics in the stmmac driver. I haven't seen so many statistics in other eth drivers. Is it possible to remove some useless or not that useful statistic members?
I don't feel authorized to make the decision. But I also wonder about some counters. For example, why are there both tx_packets and tx_pkt_n? The former is shown as TX packets by "ip stats show dev end0", the latter is shown as tx_pkt_n by "ethtool -S end0". The values do differ, but I have no clue why, or whether they are even expected to differ or it's a bug.
Petr T