[TLDR: I'm adding this regression report to the list of tracked regressions; all text from me you find below is based on a few template paragraphs you might have encountered already in similar form.]
TWIMC: this mail is primarily sent for documentation purposes and for regzbot, my Linux kernel regression tracking bot. These mails usually contain '#forregzbot' in the subject, to make them easy to spot and filter.
Hi, this is your Linux kernel regression tracker.
On 01.07.22 13:11, Kajetan Puchalski wrote:
Hi,
While running the udp-flood test from stress-ng on Ampere Altra (Mt. Jade platform) I encountered a kernel panic caused by NULL pointer dereference within nf_conntrack.
The issue is present in the latest mainline (5.19-rc4), latest stable (5.18.8), as well as multiple older stable versions. The last working stable version I found was 5.15.40.
Through bisecting I've traced the issue back to mainline commit 719774377622bc4025d2a74f551b5dc2158c6c30 (netfilter: conntrack: convert to refcount_t api); on kernels from before this commit the test runs fine. As far as I can tell, this commit was included in stable with version 5.15.41, thus causing the regression compared to 5.15.40. It was included in the mainline with version 5.16.
FWIW, looks like it was merged for v5.17-rc1:
$ git describe --contains --tags 719774377622bc4025
v5.17-rc1~170^2~24^2~19
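For readers unfamiliar with that output: the part before the first '~' or '^' is the earliest tag that contains the commit, and the rest is git's revision syntax for the path from that tag back to the commit ('~N' walks N first-parent generations, '^2' picks the second parent of a merge). A minimal sketch of pulling out just the tag, assuming a POSIX shell; the function name first_containing_tag is made up for illustration and is not part of git or regzbot:

```shell
#!/bin/sh
# Hypothetical helper: reduce `git describe --contains` output to the tag name.
first_containing_tag() {
    # Remove the longest suffix that starts at a '~' or '^',
    # which leaves only the tag in front of the revision path.
    printf '%s\n' "${1%%[~^]*}"
}

# Output string taken from the describe call above.
first_containing_tag 'v5.17-rc1~170^2~24^2~19'   # prints v5.17-rc1
```

In a kernel checkout the argument could come straight from the command, e.g. first_containing_tag "$(git describe --contains --tags 719774377622bc4025)".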
The issue is very consistently reproducible as well: running this command resulted in the same kernel panic every time I tried it on different kernels from after the change in question was merged.
stress-ng --udp-flood 0 -t 1m --metrics-brief --perf
The commit was not easily revertible so I can't say whether reverting it on the latest mainline would fix the problem or not.
[...]
The distribution is Ubuntu 20.04.3 LTS, the architecture is aarch64.
Please let me know if I can provide any more details or try any more tests.
Thanks for the report. To be sure the issue below doesn't fall through the cracks unnoticed, I'm adding it to regzbot, my Linux kernel regression tracking bot:
#regzbot ^introduced 719774377622bc402
#regzbot title net: netfilter: stress-ng udp-flood causes kernel panic on Ampere Altra
This isn't a regression? This issue or a fix for it are already discussed somewhere else? It was fixed already? You want to clarify when the regression started to happen? Or point out I got the title or something else totally wrong? Then just reply -- ideally with also telling regzbot about it, as explained here: https://linux-regtracking.leemhuis.info/tracked-regression/
Reminder for developers: When fixing the issue, add 'Link:' tags pointing to the report (the mail this one replies to), as explained in the Linux kernel's documentation; the webpage above explains why this is important for tracked regressions.
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
P.S.: As the Linux kernel's regression tracker I deal with a lot of reports and sometimes miss something important when writing mails like this. If that's the case here, don't hesitate to tell me in a public reply; it's in everyone's interest to set the public record straight.