On Tue, Jul 23, 2019 at 11:15 AM Viresh Kumar viresh.kumar@linaro.org wrote:
On 23-07-19, 00:10, Doug Smythies wrote:
On 2019.07.21 23:52 Viresh Kumar wrote:
To avoid reducing the frequency of a CPU prematurely, we skip reducing the frequency if the CPU had been busy recently.
This should not be done when the limits of the policy are changed, for example due to thermal throttling. We should always get the frequency within limits as soon as possible.
Fixes: ecd288429126 ("cpufreq: schedutil: Don't set next_freq to UINT_MAX") Cc: v4.18+ stable@vger.kernel.org # v4.18+ Reported-by: Doug Smythies doug.smythies@gmail.com Signed-off-by: Viresh Kumar viresh.kumar@linaro.org
@Doug: Please try this patch, it must fix the issue you reported.
It fixes the driver = acpi-cpufreq ; governor = schedutil test case It does not fix the driver = intel_cpufreq ; governor = schedutil test case
I have checked my results twice, but will check again in the day or two.
The patch you tried to revert wasn't doing any driver specific stuff but only schedutil. If that revert fixes your issue with both the drivers, then this patch should do it as well.
I am clueless now on what can go wrong with intel_cpufreq driver with schedutil now.
Though there is one difference between intel_cpufreq and acpi_cpufreq, intel_cpufreq has fast_switch_possible=true and so it uses slightly different path in schedutil. I tried to look from that perspective as well but couldn't find anything wrong.
acpi-cpufreq should use fast switching on the Doug's system too.
If you still find intel_cpufreq to be broken, even with this patch, please set fast_switch_possible=false instead of true in __intel_pstate_cpu_init() and try tests again. That shall make it very much similar to acpi-cpufreq driver.
I wonder if this helps. Even so, we want fast switching to be used by intel_cpufreq.
Anyway, it looks like the change reverted by the Doug's patch introduced a race condition that had not been present before. Namely, need_freq_update is cleared in get_next_freq() when it is set _or_ when the new freq is different from the cached one, so in the latter case if it happens to be set by sugov_limits() after evaluating sugov_should_update_freq() (which returned 'true' for timing reasons), that update will be lost now. [Previously the update would not be lost, because the clearing of need_freq_update depended only on its current value.] Where it matters is that in the "need_freq_update set" case, the "premature frequency reduction avoidance" should not be applied (as you noticed and hence the $subject patch).
However, even with the $subject patch, need_freq_update may still be set by sugov_limits() after the check added by it and then cleared by get_next_freq(), so it doesn't really eliminate the problem.
IMO eliminating would require invalidating next_freq this way or another when need_freq_update is set in sugov_should_update_freq(), which was done before commit ecd2884291261e3fddbc7651ee11a20d596bb514.