On 2019.07.25 23:58 Viresh Kumar wrote:
On 25-07-19, 08:20, Doug Smythies wrote:
I tried the patch ("patch2"). It did not fix the issue.
To summarize, all kernel 5.2 based, all intel_cpufreq driver and schedutil governor:
Test: Does a busy system respond to maximum CPU clock frequency reduction?
stock, unaltered: No.
revert ecd2884291261e3fddbc7651ee11a20d596bb514: Yes
viresh patch: No.
fast_switch edit: No.
viresh patch2: No.
Hmm, so I tried to reproduce your setup on my ARM board.
- booted only with CPU0 so I hit the sugov_update_single() routine
- And applied below diff to make CPU look permanently busy:
-------------------------8<-------------------------
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 2f382b0959e5..afb47490e5dc 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -121,6 +121,7 @@ static void sugov_fast_switch(struct sugov_policy *sg_policy, u64 time,
 	if (!sugov_update_next_freq(sg_policy, time, next_freq))
 		return;
 
+	pr_info("%s: %d: %u\n", __func__, __LINE__, freq);

?? there is no "freq" variable here, and so this doesn't compile. However this works:

+	pr_info("%s: %d: %u\n", __func__, __LINE__, next_freq);

 	next_freq = cpufreq_driver_fast_switch(policy, next_freq);
 	if (!next_freq)
 		return;
@@ -424,14 +425,10 @@ static unsigned long sugov_iowait_apply(struct sugov_cpu *sg_cpu, u64 time,
 #ifdef CONFIG_NO_HZ_COMMON
 static bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu)
 {
-	unsigned long idle_calls = tick_nohz_get_idle_calls_cpu(sg_cpu->cpu);
-	bool ret = idle_calls == sg_cpu->saved_idle_calls;
-
-	sg_cpu->saved_idle_calls = idle_calls;
-	return ret;
+	return true;
 }
 #else
-static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return false; }
+static inline bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu) { return true; }
 #endif /* CONFIG_NO_HZ_COMMON */
 
 /*
@@ -565,6 +562,7 @@ static void sugov_work(struct kthread_work *work)
 	sg_policy->work_in_progress = false;
 	raw_spin_unlock_irqrestore(&sg_policy->update_lock, flags);
 
+	pr_info("%s: %d: %u\n", __func__, __LINE__, freq);
 	mutex_lock(&sg_policy->work_lock);
 	__cpufreq_driver_target(sg_policy->policy, freq, CPUFREQ_RELATION_L);
 	mutex_unlock(&sg_policy->work_lock);
-------------------------8<-------------------------
Now the frequency never goes down, and so it gets set to the maximum possible after a bit.
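For reference, the check that the diff short-circuits can be modelled in userspace. This is only a sketch: `struct cpu_model` and `cpu_is_busy()` are made-up stand-ins for `struct sugov_cpu` and `sugov_cpu_is_busy()`, and the idle-call counter is passed in by the caller rather than read from `tick_nohz_get_idle_calls_cpu()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the one field of struct sugov_cpu used here. */
struct cpu_model {
	unsigned long saved_idle_calls;
};

/*
 * Mirrors the lines the diff removes from sugov_cpu_is_busy(): the CPU
 * counts as busy if its idle-call counter has not advanced since the
 * previous check. idle_calls is supplied by the caller (assumption; the
 * kernel obtains it from tick_nohz_get_idle_calls_cpu()).
 */
static bool cpu_is_busy(struct cpu_model *cpu, unsigned long idle_calls)
{
	bool ret;

	assert(cpu != NULL);
	ret = idle_calls == cpu->saved_idle_calls;
	cpu->saved_idle_calls = idle_calls;	/* remember for the next check */
	return ret;
}
```

With the diff applied the real function returns true unconditionally, which is what makes the CPU look permanently busy regardless of the counter.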
- Then I did:
echo <any-low-freq-value> > /sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq
Without my patch applied: The print never gets printed and so frequency doesn't go down.
With my patch applied: The print gets printed immediately from sugov_work() and so the frequency reduces.
Can you try with this diff along with my Patch2? I suspect there may be something wrong with the intel_cpufreq driver, as the patch fixes the only path we have in the schedutil governor which takes busyness of a CPU into account.
With this diff along with your patch2, there is never a print message from sugov_work. There are print messages from sugov_fast_switch.
Note that for the intel_cpufreq CPU scaling driver and the schedutil governor I adjust the maximum clock frequency this way:
echo <any-low-percent> > /sys/devices/system/cpu/intel_pstate/max_perf_pct
I also applied the pr_info messages to the reverted kernel, and re-did my tests (where everything works as expected). There is never a print message from sugov_work. There are from sugov_fast_switch.
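A tally like the one I describe could be automated roughly as follows. This is a sketch under assumptions: `classify_line()` and `count_prints()` are hypothetical helper names, and the only thing relied upon is that each message contains the function name followed by ": ", as the `"%s: %d: %u\n"` format with `__func__` produces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum print_source { SRC_OTHER, SRC_SUGOV_WORK, SRC_SUGOV_FAST_SWITCH };

/* Decide which of the two pr_info() call sites, if any, emitted a log line. */
static enum print_source classify_line(const char *line)
{
	assert(line != NULL);
	if (strstr(line, "sugov_fast_switch: "))
		return SRC_SUGOV_FAST_SWITCH;
	if (strstr(line, "sugov_work: "))
		return SRC_SUGOV_WORK;
	return SRC_OTHER;
}

/*
 * Count lines per source from a stream of kernel log text.
 * counts[] is indexed by enum print_source. Lines longer than the
 * buffer are read in pieces, which is assumed not to matter for
 * messages this short.
 */
static void count_prints(FILE *in, unsigned long counts[3])
{
	char line[512];

	counts[0] = counts[1] = counts[2] = 0;
	while (fgets(line, sizeof(line), in))
		counts[classify_line(line)]++;
}
```

Called on a stream holding the kernel log, `counts[SRC_SUGOV_WORK]` staying at zero while `counts[SRC_SUGOV_FAST_SWITCH]` grows would correspond to what I observe on both kernels.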
Notes:
I do not know if:
/sys/devices/system/cpu/cpufreq/policy*/scaling_max_freq
/sys/devices/system/cpu/cpufreq/policy*/scaling_min_freq
need to be accurate when using the intel_pstate driver in passive mode. They are not.
The commit comment for 9083e4986124389e2a7c0ffca95630a4983887f0 suggests that they might need to be representative.
I wonder if something similar to that commit is needed for other global changes, such as max_perf_pct and min_perf_pct?
intel_cpufreq/ondemand doesn't work properly on the reverted kernel. (just discovered, not investigated) I don't know about other governors.
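Regarding the first note, something along these lines could be used to watch whether the per-policy limit follows a change to the global one. It is a minimal sketch: `read_ulong_file()` and `report_limits()` are hypothetical helpers, and policy0 is picked only as an example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Read one unsigned decimal value from a sysfs-style file.
 * Returns 0 on success, -1 if the file is missing or unparsable.
 */
static int read_ulong_file(const char *path, unsigned long *out)
{
	FILE *f;
	int ok;

	assert(path != NULL && out != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = fscanf(f, "%lu", out) == 1;
	fclose(f);
	return ok ? 0 : -1;
}

/* Print the per-policy and global maximum limits side by side. */
static void report_limits(void)
{
	unsigned long khz, pct;

	if (read_ulong_file("/sys/devices/system/cpu/cpufreq/policy0/scaling_max_freq", &khz) == 0)
		printf("policy0 scaling_max_freq: %lu\n", khz);
	else
		printf("policy0 scaling_max_freq: unavailable\n");

	if (read_ulong_file("/sys/devices/system/cpu/intel_pstate/max_perf_pct", &pct) == 0)
		printf("intel_pstate max_perf_pct: %lu\n", pct);
	else
		printf("intel_pstate max_perf_pct: unavailable\n");
}
```

Calling `report_limits()` before and after writing to max_perf_pct would show whether scaling_max_freq moves with it.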
... Doug