On Fri, Aug 2, 2019 at 7:44 AM Viresh Kumar viresh.kumar@linaro.org wrote:
Intel pstate driver exposes min_perf_pct and max_perf_pct sysfs files, which can be used to force a limit on the min/max P state of the driver. Though these files eventually control the min/max frequencies that the CPUs will run at, they don't make a change to policy->min/max values.
That's correct.
When the values of these files are changed (in passive mode of the driver), it leads to calling ->limits() callback of the cpufreq governors, like schedutil. On a call to it the governors shall forcefully update the frequency to come within the limits.
OK, so the problem is that it is a bug to invoke the governor's ->limits() callback without updating policy->min/max, because that's what "limits" mean to the governors.
Fair enough.
For getting the value within limits, the schedutil governor calls cpufreq_driver_resolve_freq(), which eventually tries to call ->resolve_freq() callback for this driver. Since the callback isn't present, the schedutil governor fails to get the target freq within limit and sometimes aborts the update believing that the frequency is already set to the target value.
This patch implements the resolve_freq() callback, so the correct target frequency can be returned by the driver and the schedutil governor gets the frequency within limits immediately.
So the problem is that ->resolve_freq() adds overhead and it adds that overhead even if the limits don't change. It just sits there and computes things every time even if that is completely redundant.
So no, this is not the right way to fix it IMO.