On Wed, Nov 29, 2023 at 6:32 PM Peter Zijlstra <peterz@infradead.org> wrote:
On Wed, Nov 29, 2023 at 05:33:33PM +0800, Zhengyuan Liu wrote:
Hi, all
We are encountering a perf related soft lockup as shown below:
[25023823.265138] watchdog: BUG: soft lockup - CPU#29 stuck for 45s! [YD:3284696]
[25023823.275772] net_failover virtio_scsi failover
[25023823.276750] CPU: 29 PID: 3284696 Comm: YD Kdump: loaded Not tainted 4.19.90-23.18.v2101.ky10.aarch64 #1
^^^^^^^^^^^^^^^^^^^
That is some unholy ancient kernel. Please see if you can reproduce on something recent.
Sorry for the late reply; my company's mail server has been having some trouble.
I don't have a reproducer. It's an online server, and the lockup happens once every few months. From our analysis, recent kernels shouldn't have this problem after commit bd27568117664 ("perf: Rewrite core context handling"). But LTS branches such as v4.19 and v5.4 will be in use for a long time, so I think it's worth fixing this problem.
Thanks,