On Tue, Jun 13, 2023 at 11:58:05AM -0700, Bhatnagar, Rishabh wrote:
On 6/13/23 11:49 AM, Bhatnagar, Rishabh wrote:
Hi Sebastian/Greg
We are seeing RCU stall warnings since recent stable tree updates: 5.4.243, 5.10.180, 5.15.113 and 6.1.31 onwards. This is seen on the upstream stable trees without any downstream patches.
The issue is seen a few minutes after booting, without any workload. We launch hundreds of virtual instances and this shows up in 1-2 of them, so it's hard to reproduce. Attaching a few stack traces below.
The issue can be seen on both virtual and bare-metal instances. Another interesting point is that we only see this on x86-based instances. We also tested linux-mainline but were not able to reproduce the issue there. So maybe there's a fixup or related commit that has gone in?
We tried bisecting the stable trees and found that, after reverting the commit below, we could no longer reproduce this on any of the kernels.
tick/common: Align tick period with the HZ tick. [ Upstream commit e9523a0d81899361214d118ad60ef76f0e92f71d ]
We are not exactly sure how this commit is affecting all the stable kernels. Can you take a look at this issue and share your insight?
Does this issue also show up in 6.3.y and in 6.4-rc5?
thanks,
greg k-h