On Fri, Oct 12, 2018 at 07:10:55PM -0700, Ivan Delalande wrote:
Hi Greg,
This series fixes issues we've seen with softirq time accounting in 4.9:
- when ksoftirqd is running at 100% on a CPU, none of the values reported by /proc/stat for that CPU will change, sometimes for dozens of seconds,
- large deviations in the total number of ticks accumulated over a fixed time for a CPU, probably because of the first issue hitting for shorter periods.
We found out that something pretty similar had been reported 9 months ago, see the reference link below. In that discussion, Rabin Vincent had made a 4.9-specific patch which fixes our first issue, but we were still seeing some deviation from the expected total number of ticks (up to 1.7%, where we had only 0.2% on older kernels), and you had also asked for a direct backport from the mainline series, if possible.
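For readers who want to reproduce this kind of measurement, here is a minimal sketch of how the per-CPU tick deviation could be estimated from /proc/stat. It is not the test used for the numbers above; the function names (parse_cpu_ticks, tick_deviation, measure) and the 10-second default interval are assumptions made for illustration. It sums the first eight counters of a CPU line (user through steal, since guest time is already counted in user and nice) and compares the delta over an interval with the expected interval * USER_HZ.

```python
import os
import time


def parse_cpu_ticks(stat_text, cpu):
    """Return the total ticks accounted on one 'cpuN' line of /proc/stat."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == cpu:
            # Columns: user nice system idle iowait irq softirq steal
            # (guest and guest_nice follow but are already part of
            # user and nice, so they are left out of the sum).
            return sum(int(v) for v in fields[1:9])
    raise ValueError("no line for %r in /proc/stat" % cpu)


def tick_deviation(before, after, elapsed_s, hz):
    """Relative deviation of accounted ticks from elapsed_s * hz."""
    expected = elapsed_s * hz
    return (after - before - expected) / expected


def measure(cpu="cpu0", interval_s=10.0):
    """Sample /proc/stat twice and return the relative tick deviation."""
    hz = os.sysconf("SC_CLK_TCK")  # USER_HZ, the unit used by /proc/stat
    with open("/proc/stat") as f:
        before = parse_cpu_ticks(f.read(), cpu)
    start = time.monotonic()
    time.sleep(interval_s)
    with open("/proc/stat") as f:
        after = parse_cpu_ticks(f.read(), cpu)
    elapsed = time.monotonic() - start
    return tick_deviation(before, after, elapsed, hz)
```

Calling measure("cpu0") on an otherwise healthy system should return a value close to 0. With the first issue above, the counters for the affected CPU would not move at all during the stall, so the result would be strongly negative for intervals that fall inside it.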
As mentioned in that thread, a lot of changes (probably 50+) went into 4.11 to remove cputime, but we could get something working with only the 4 attached patches to fix these two issues. Three of these patches apply without change, and the second one in the series ("sched/cputime: Convert kcpustat to nsecs") needed a minor change as a cast had been added in 527b0a76f41d ("sched/cpuacct: Avoid %lld seq_printf warning") to fix a build warning on s390. I guess we could also include that patch in this series, let me know if this is the preferred way to handle this.
We ran our tests on 3.18, 4.4 and 4.9 and confirmed that only 4.9 would need this series, and that this series indeed restores the behavior we were seeing on those older kernels.
Thanks!
Reference: http://lkml.kernel.org/r/%3C1513159876-5125-1-git-send-email-rabin.vincent@a...
v2:
- drop "time: Introduce jiffies64_to_nsecs()" as it has already been merged into v4.9.132,
- include backport of commit 564b733c899f ("macintosh/rack-meter: Convert cputime64_t use to u64") to avoid introducing a build failure on powerpc.
Ok, let's try this again, all queued up.
greg k-h