From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 6acc72a43eac78a309160d0a7512bbc59bcdd757 ]
The blamed commit changed struct ocelot_stat_layout :: "u32 offset" to "u32 reg".
However, "u32 reg" is not quite a register address, but an enum ocelot_reg, which in itself encodes an enum ocelot_target target in the upper bits, and an index into the ocelot->map[target][] array in the lower bits.
So, whereas the previous code comparison between stats_layout[i].offset and last + 1 was correct (because those "offsets" at the time were 32-bit relative addresses), the new code, comparing layout[i].reg to last + 4 is not correct, because the "reg" here is an enum/index, not an actual register address.
What we want to compare are indeed register addresses, but to do that, we need to actually go through the same motions as __ocelot_bulk_read_ix() itself.
With this bug, all statistics counters are deemed by ocelot_prepare_stats_regions() as constituting their own region. (Truncated) log on VSC9959 (Felix) below (prints added by me):
Before:
region of 1 contiguous counters starting with SYS:STAT:CNT[0x000] region of 1 contiguous counters starting with SYS:STAT:CNT[0x001] region of 1 contiguous counters starting with SYS:STAT:CNT[0x002] ... region of 1 contiguous counters starting with SYS:STAT:CNT[0x041] region of 1 contiguous counters starting with SYS:STAT:CNT[0x042] region of 1 contiguous counters starting with SYS:STAT:CNT[0x080] region of 1 contiguous counters starting with SYS:STAT:CNT[0x081] ... region of 1 contiguous counters starting with SYS:STAT:CNT[0x0ac] region of 1 contiguous counters starting with SYS:STAT:CNT[0x100] region of 1 contiguous counters starting with SYS:STAT:CNT[0x101] ... region of 1 contiguous counters starting with SYS:STAT:CNT[0x111]
After:
region of 67 contiguous counters starting with SYS:STAT:CNT[0x000] region of 45 contiguous counters starting with SYS:STAT:CNT[0x080] region of 18 contiguous counters starting with SYS:STAT:CNT[0x100]
Since commit d87b1c08f38a ("net: mscc: ocelot: use bulk reads for stats") intended bulking as a performance improvement, and since now, with trivial-sized regions, performance is even worse than without bulking at all, this could easily qualify as a performance regression.
Fixes: d4c367650704 ("net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Acked-by: Colin Foster colin.foster@in-advantage.com Tested-by: Colin Foster colin.foster@in-advantage.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mscc/ocelot_stats.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mscc/ocelot_stats.c b/drivers/net/ethernet/mscc/ocelot_stats.c index 1478c3b21af15..cb196775489e2 100644 --- a/drivers/net/ethernet/mscc/ocelot_stats.c +++ b/drivers/net/ethernet/mscc/ocelot_stats.c @@ -611,7 +611,8 @@ static int ocelot_prepare_stats_regions(struct ocelot *ocelot) if (!ocelot_stats_layout[i].reg) continue;
- if (region && ocelot_stats_layout[i].reg == last + 4) { + if (region && ocelot->map[SYS][ocelot_stats_layout[i].reg & REG_MASK] == + ocelot->map[SYS][last & REG_MASK] + 4) { region->count++; } else { region = devm_kzalloc(ocelot->dev, sizeof(*region),