On 23-07-19, 12:27, Rafael J. Wysocki wrote:
On Tue, Jul 23, 2019 at 11:15 AM Viresh Kumar viresh.kumar@linaro.org wrote:
Though there is one difference between intel_cpufreq and acpi_cpufreq,
intel_cpufreq has fast_switch_possible=true and so it uses slightly
different path in schedutil. I tried to look from that perspective as
well but couldn't find anything wrong.
acpi-cpufreq should use fast switching on the Doug's system too.
Ah okay.
If you still find intel_cpufreq to be broken, even with this patch,
please set fast_switch_possible=false instead of true in
__intel_pstate_cpu_init() and try tests again. That shall make it very
much similar to acpi-cpufreq driver.
I wonder if this helps. Even so, we want fast switching to be used by
intel_cpufreq.
With both using fast switching it shouldn't make any difference.
Anyway, it looks like the change reverted by the Doug's patch
introduced a race condition that had not been present before. Namely,
need_freq_update is cleared in get_next_freq() when it is set _or_
when the new freq is different from the cached one, so in the latter
case if it happens to be set by sugov_limits() after evaluating
sugov_should_update_freq() (which returned 'true' for timing reasons),
that update will be lost now. [Previously the update would not be
lost, because the clearing of need_freq_update depended only on its
current value.] Where it matters is that in the "need_freq_update set"
case, the "premature frequency reduction avoidance" should not be
applied (as you noticed and hence the $subject patch).
However, even with the $subject patch, need_freq_update may still be
set by sugov_limits() after the check added by it and then cleared by
get_next_freq(), so it doesn't really eliminate the problem.
IMO eliminating would require invalidating next_freq this way or
another when need_freq_update is set in sugov_should_update_freq(),
which was done before commit ecd2884291261e3fddbc7651ee11a20d596bb514.
Hmm, so to avoid locking in fast path we need two variable group to
protect against this kind of issues. I still don't want to override
next_freq with a special meaning as it can cause hidden bugs, we have
seen that earlier.
What about something like this then ?
--
viresh
-------------------------8<-------------------------
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 636ca6f88c8e..2f382b0959e5 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -40,6 +40,7 @@ struct sugov_policy {
struct task_struct *thread;
bool work_in_progress;
+ bool limits_changed;
bool need_freq_update;
};
@@ -89,8 +90,11 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)
!cpufreq_this_cpu_can_update(sg_policy->policy))
return false;
- if (unlikely(sg_policy->need_freq_update))
+ if (unlikely(sg_policy->limits_changed)) {
+ sg_policy->limits_changed = false;
+ sg_policy->need_freq_update = true;
return true;
+ }
delta_ns = time - sg_policy->last_freq_update_time;
@@ -437,7 +441,7 @@ static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return false; }
static inline void ignore_dl_rate_limit(struct sugov_cpu *sg_cpu, struct sugov_policy *sg_policy)
{
if (cpu_bw_dl(cpu_rq(sg_cpu->cpu)) > sg_cpu->bw_dl)
- sg_policy->need_freq_update = true;
+ sg_policy->limits_changed = true;
}
static void sugov_update_single(struct update_util_data *hook, u64 time,
@@ -447,7 +451,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
struct sugov_policy *sg_policy = sg_cpu->sg_policy;
unsigned long util, max;
unsigned int next_f;
- bool busy;
+ bool busy = false;
sugov_iowait_boost(sg_cpu, time, flags);
sg_cpu->last_update = time;
@@ -457,7 +461,9 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
if (!sugov_should_update_freq(sg_policy, time))
return;
- busy = sugov_cpu_is_busy(sg_cpu);
+ /* Limits may have changed, don't skip frequency update */
+ if (!sg_policy->need_freq_update)
+ busy = sugov_cpu_is_busy(sg_cpu);
util = sugov_get_util(sg_cpu);
max = sg_cpu->max;
@@ -831,6 +837,7 @@ static int sugov_start(struct cpufreq_policy *policy)
sg_policy->last_freq_update_time = 0;
sg_policy->next_freq = 0;
sg_policy->work_in_progress = false;
+ sg_policy->limits_changed = false;
sg_policy->need_freq_update = false;
sg_policy->cached_raw_freq = 0;
@@ -879,7 +886,7 @@ static void sugov_limits(struct cpufreq_policy *policy)
mutex_unlock(&sg_policy->work_lock);
}
- sg_policy->need_freq_update = true;
+ sg_policy->limits_changed = true;
}
struct cpufreq_governor schedutil_gov = {