On Sat, Mar 05, 2022 at 09:44:27AM -0700, dann frazier wrote:
The LTP cpuset_sched_domains test, authored by Miao Xie, fails on a Kunpeng920 server that has 4 NUMA nodes: https://launchpad.net/bugs/1951289
This does appear to be a real bug. /proc/schedstat displays 4 domain levels for CPUs on 2 of the nodes, but only 3 levels for the other 2 (see below). I assume this means the scheduler is making suboptimal decisions about where to place/move processes. I'm not sure how to demonstrate that - but open to suggestions if that evidence is important justification for stable.
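The per-CPU domain-level count described above can be checked mechanically. What follows is a minimal sketch, not something taken from this thread: the function names are hypothetical, and it assumes only that /proc/schedstat lists each "cpuN" line followed by that CPU's "domainN" lines, ignoring every other field.

```python
import re
from collections import defaultdict


def count_domain_levels(schedstat_text):
    """Map each CPU id to the number of 'domainN' lines listed under it."""
    levels = {}
    current_cpu = None
    for line in schedstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        cpu_match = re.fullmatch(r"cpu(\d+)", fields[0])
        if cpu_match:
            # A new per-CPU record starts; its domain lines follow.
            current_cpu = int(cpu_match.group(1))
            levels[current_cpu] = 0
        elif re.fullmatch(r"domain\d+", fields[0]) and current_cpu is not None:
            levels[current_cpu] += 1
    return levels


def group_by_level_count(levels):
    """Group CPU ids by how many domain levels each one has."""
    groups = defaultdict(list)
    for cpu, count in sorted(levels.items()):
        groups[count].append(cpu)
    return dict(groups)
```

Feeding it the contents of /proc/schedstat, e.g. group_by_level_count(count_domain_levels(open("/proc/schedstat").read())), gives one entry per distinct level count. On a system where every CPU has the same number of domain levels, the result has a single key.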
This is not a problem in current upstream kernels, so I bisected and found that the first patch here fixes it. I can't tell from the commit message if fixing this case was Valentin's intent, or just a happy side-effect of the set conversion. The other two patches fix regressions introduced by the first. All cherry-pick cleanly back to 5.10.y and 5.4.y. This platform easily reproduces the problem Dietmar's fix addresses. I don't have hardware to test the ia64 fix.
Note: This also impacts earlier stable trees, but those require some minor porting, so I'll submit fixes for them separately.
Here's a comparison of /proc/schedstat before & after applying these fixes:
Thanks, now queued up.
greg k-h