freq_qos_update_request() returns 1 if the effective constraint value has changed, 0 if the effective constraint value has not changed, or a negative error code on failures.
The frequency constraints for CPUs can be set by different parts of the kernel. If the maximum frequency constraint set by other parts of the kernel is lower than the one corresponding to cooling state 0, then we will never be able to cool down the system, as freq_qos_update_request() will keep on returning 0 and we will skip updating cpufreq_state and thermal pressure.
Fix that by doing the updates even in the case where freq_qos_update_request() returns 0, as we have effectively set the constraint to a new value even if the consolidated value of the actual constraint is unchanged because of external factors.
Cc: v5.7+ <stable@vger.kernel.org> # v5.7+
Reported-by: Thara Gopinath <thara.gopinath@linaro.org>
Fixes: f12e4f66ab6a ("thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
Hi Guys,
This needs to go in 5.12-rc.
Thara, please give this a try and give your tested-by :).
 drivers/thermal/cpufreq_cooling.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thermal/cpufreq_cooling.c b/drivers/thermal/cpufreq_cooling.c
index f5af2571f9b7..10af3341e5ea 100644
--- a/drivers/thermal/cpufreq_cooling.c
+++ b/drivers/thermal/cpufreq_cooling.c
@@ -485,7 +485,7 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
 	frequency = get_state_freq(cpufreq_cdev, state);
 
 	ret = freq_qos_update_request(&cpufreq_cdev->qos_req, frequency);
-	if (ret > 0) {
+	if (ret >= 0) {
 		cpufreq_cdev->cpufreq_state = state;
 		cpus = cpufreq_cdev->policy->cpus;
 		max_capacity = arch_scale_cpu_capacity(cpumask_first(cpus));
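To make the failure mode concrete, here is a small user-space sketch of the return convention described in the changelog. It is not kernel code: struct model_qos, model_effective_max(), model_update_request() and model_set_cur_state() are hypothetical names made up for illustration, and the model assumes that the effective maximum is simply the lowest cap among all requests.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_NR_REQS 2	/* hypothetical: the cooling device plus one other requester */

/* Hypothetical model of a set of max-frequency requests, in kHz. */
struct model_qos {
	int max_khz[MODEL_NR_REQS];
};

/* Assumption: the effective maximum is the lowest cap requested by anyone. */
int model_effective_max(const struct model_qos *q)
{
	int i, min;

	assert(q != NULL);
	min = q->max_khz[0];
	for (i = 1; i < MODEL_NR_REQS; i++)
		if (q->max_khz[i] < min)
			min = q->max_khz[i];
	return min;
}

/* Returns 1 if the effective value changed, 0 if not, negative errno on error. */
int model_update_request(struct model_qos *q, int idx, int new_khz)
{
	int before;

	if (q == NULL || idx < 0 || idx >= MODEL_NR_REQS || new_khz < 0)
		return -EINVAL;

	before = model_effective_max(q);
	q->max_khz[idx] = new_khz;
	return model_effective_max(q) != before;
}

/*
 * Returns the cooling state recorded after the update. Request 0 belongs
 * to the cooling device. use_fix == 0 models "if (ret > 0)", any other
 * value models "if (ret >= 0)".
 */
unsigned int model_set_cur_state(struct model_qos *q, unsigned int cur_state,
				 unsigned int new_state, int new_khz, int use_fix)
{
	int ret = model_update_request(q, 0, new_khz);

	if (use_fix ? ret >= 0 : ret > 0)
		return new_state;
	return cur_state;
}
```

With another requester already capping at 1000000 kHz, moving the cooling device's own request from 2000000 to 1800000 kHz leaves the effective value untouched, so the update returns 0. The "ret > 0" variant then keeps reporting the old state, while the "ret >= 0" variant records the new one.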
Hi Viresh,
On 2/17/21 5:48 AM, Viresh Kumar wrote:
> freq_qos_update_request() returns 1 if the effective constraint value has changed, 0 if the effective constraint value has not changed, or a negative error code on failures.
>
> The frequency constraints for CPUs can be set by different parts of the kernel. If the maximum frequency constraint set by other parts of the kernel is lower than the one corresponding to cooling state 0, then we will never be able to cool down the system, as freq_qos_update_request() will keep on returning 0 and we will skip updating cpufreq_state and thermal pressure.

To be precise, the thermal pressure signal is not so important in this mechanism, and the 'cpufreq_state' has changed recently:

236761f19a4f373354 thermal/drivers/cpufreq_cooling: Update cpufreq_state only if state has changed

> Fix that by doing the updates even in the case where freq_qos_update_request() returns 0, as we have effectively set the constraint to a new value even if the consolidated value of the actual constraint is unchanged because of external factors.
>
> Cc: v5.7+ <stable@vger.kernel.org> # v5.7+
> Reported-by: Thara Gopinath <thara.gopinath@linaro.org>
> Fixes: f12e4f66ab6a ("thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping")

I'm not sure if that f12e4f is the root cause.

> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
>
> Hi Guys,
>
> This needs to go in 5.12-rc.
>
> Thara, please give this a try and give your tested-by :).
>
>  drivers/thermal/cpufreq_cooling.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Anyway, the fix LGTM. I will have to make sure that I'm CC'ed on this topic, so I can have a look (I somehow missed 236761f19).

Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>

Regards,
Lukasz
On 17-02-21, 10:29, Lukasz Luba wrote:
> On 2/17/21 5:48 AM, Viresh Kumar wrote:
> > freq_qos_update_request() returns 1 if the effective constraint value has changed, 0 if the effective constraint value has not changed, or a negative error code on failures.
> >
> > The frequency constraints for CPUs can be set by different parts of the kernel. If the maximum frequency constraint set by other parts of the kernel is lower than the one corresponding to cooling state 0, then we will never be able to cool down the system, as freq_qos_update_request() will keep on returning 0 and we will skip updating cpufreq_state and thermal pressure.
>
> To be precise, the thermal pressure signal is not so important in this mechanism, and the 'cpufreq_state' has changed recently:

Right, I wasn't concerned only about the lack of thermal cooling, but about both thermal cooling and pressure.

> 236761f19a4f373354 thermal/drivers/cpufreq_cooling: Update cpufreq_state only if state has changed

This moved the assignment to a more logical place for me, i.e. not to do that on errors, just that the block in which it landed may not get called at all :(

> > Fix that by doing the updates even in the case where freq_qos_update_request() returns 0, as we have effectively set the constraint to a new value even if the consolidated value of the actual constraint is unchanged because of external factors.
> >
> > Cc: v5.7+ <stable@vger.kernel.org> # v5.7+
> > Reported-by: Thara Gopinath <thara.gopinath@linaro.org>
> > Fixes: f12e4f66ab6a ("thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping")
>
> I'm not sure if that f12e4f is the root cause.

Hmm, depends on how we define the problem :)

If this was just about thermal-cooling not happening, then maybe yes, but to me it is rather about the mishandled return value of freq_qos_update_request(), which has more than one side effect, and so I went for the main commit.

This is also important as f12e4f66ab6a got merged in 5.7 and 236761f19 got merged in 5.11, and this patch needs to get applied in stable kernels since 5.7 to fix it all.

> > Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> >
> > Hi Guys,
> >
> > This needs to go in 5.12-rc.
> >
> > Thara, please give this a try and give your tested-by :).
> >
> >  drivers/thermal/cpufreq_cooling.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
>
> Anyway, the fix LGTM. I will have to make sure that I'm CC'ed on this topic, so I can have a look (I somehow missed 236761f19).
>
> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
> Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Thanks.
On 2/17/21 10:39 AM, Viresh Kumar wrote:
> On 17-02-21, 10:29, Lukasz Luba wrote:
> > On 2/17/21 5:48 AM, Viresh Kumar wrote:
> > > freq_qos_update_request() returns 1 if the effective constraint value has changed, 0 if the effective constraint value has not changed, or a negative error code on failures.
> > >
> > > The frequency constraints for CPUs can be set by different parts of the kernel. If the maximum frequency constraint set by other parts of the kernel is lower than the one corresponding to cooling state 0, then we will never be able to cool down the system, as freq_qos_update_request() will keep on returning 0 and we will skip updating cpufreq_state and thermal pressure.
> >
> > To be precise, the thermal pressure signal is not so important in this mechanism, and the 'cpufreq_state' has changed recently:
>
> Right, I wasn't concerned only about the lack of thermal cooling, but about both thermal cooling and pressure.
>
> > 236761f19a4f373354 thermal/drivers/cpufreq_cooling: Update cpufreq_state only if state has changed
>
> This moved the assignment to a more logical place for me, i.e. not to do that on errors, just that the block in which it landed may not get called at all :(
>
> > > Fix that by doing the updates even in the case where freq_qos_update_request() returns 0, as we have effectively set the constraint to a new value even if the consolidated value of the actual constraint is unchanged because of external factors.
> > >
> > > Cc: v5.7+ <stable@vger.kernel.org> # v5.7+
> > > Reported-by: Thara Gopinath <thara.gopinath@linaro.org>
> > > Fixes: f12e4f66ab6a ("thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping")
> >
> > I'm not sure if that f12e4f is the root cause.
>
> Hmm, depends on how we define the problem :)
>
> If this was just about thermal-cooling not happening, then maybe yes, but to me it is rather about the mishandled return value of freq_qos_update_request(), which has more than one side effect, and so I went for the main commit.
>
> This is also important as f12e4f66ab6a got merged in 5.7 and 236761f19 got merged in 5.11, and this patch needs to get applied in stable kernels since 5.7 to fix it all.
'to fix it all' - I agree
On Wed, Feb 17, 2021 at 6:50 AM Viresh Kumar viresh.kumar@linaro.org wrote:
> freq_qos_update_request() returns 1 if the effective constraint value has changed, 0 if the effective constraint value has not changed, or a negative error code on failures.
>
> The frequency constraints for CPUs can be set by different parts of the kernel. If the maximum frequency constraint set by other parts of the kernel is lower than the one corresponding to cooling state 0, then we will never be able to cool down the system, as freq_qos_update_request() will keep on returning 0 and we will skip updating cpufreq_state and thermal pressure.
>
> Fix that by doing the updates even in the case where freq_qos_update_request() returns 0, as we have effectively set the constraint to a new value even if the consolidated value of the actual constraint is unchanged because of external factors.
>
> Cc: v5.7+ <stable@vger.kernel.org> # v5.7+
> Reported-by: Thara Gopinath <thara.gopinath@linaro.org>
> Fixes: f12e4f66ab6a ("thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping")
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>

Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

> Hi Guys,
>
> This needs to go in 5.12-rc.
>
> Thara, please give this a try and give your tested-by :).
>
>  drivers/thermal/cpufreq_cooling.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/thermal/cpufreq_cooling.c b/drivers/thermal/cpufreq_cooling.c
> index f5af2571f9b7..10af3341e5ea 100644
> --- a/drivers/thermal/cpufreq_cooling.c
> +++ b/drivers/thermal/cpufreq_cooling.c
> @@ -485,7 +485,7 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
>  	frequency = get_state_freq(cpufreq_cdev, state);
>
>  	ret = freq_qos_update_request(&cpufreq_cdev->qos_req, frequency);
> -	if (ret > 0) {
> +	if (ret >= 0) {
>  		cpufreq_cdev->cpufreq_state = state;
>  		cpus = cpufreq_cdev->policy->cpus;
>  		max_capacity = arch_scale_cpu_capacity(cpumask_first(cpus));
> --
> 2.25.0.rc1.19.g042ed3e048af
On 2/17/21 12:48 AM, Viresh Kumar wrote:
> freq_qos_update_request() returns 1 if the effective constraint value has changed, 0 if the effective constraint value has not changed, or a negative error code on failures.
>
> The frequency constraints for CPUs can be set by different parts of the kernel. If the maximum frequency constraint set by other parts of the kernel is lower than the one corresponding to cooling state 0, then we will never be able to cool down the system, as freq_qos_update_request() will keep on returning 0 and we will skip updating cpufreq_state and thermal pressure.
>
> Fix that by doing the updates even in the case where freq_qos_update_request() returns 0, as we have effectively set the constraint to a new value even if the consolidated value of the actual constraint is unchanged because of external factors.
>
> Cc: v5.7+ <stable@vger.kernel.org> # v5.7+
> Reported-by: Thara Gopinath <thara.gopinath@linaro.org>
> Fixes: f12e4f66ab6a ("thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping")
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
>
> Hi Guys,
>
> This needs to go in 5.12-rc.
>
> Thara, please give this a try and give your tested-by :).

It fixes the thermal runaway issue on sdm845 that I had reported. So,

Tested-by: Thara Gopinath <thara.gopinath@linaro.org>

>  drivers/thermal/cpufreq_cooling.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/thermal/cpufreq_cooling.c b/drivers/thermal/cpufreq_cooling.c
> index f5af2571f9b7..10af3341e5ea 100644
> --- a/drivers/thermal/cpufreq_cooling.c
> +++ b/drivers/thermal/cpufreq_cooling.c
> @@ -485,7 +485,7 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
>  	frequency = get_state_freq(cpufreq_cdev, state);
>
>  	ret = freq_qos_update_request(&cpufreq_cdev->qos_req, frequency);
> -	if (ret > 0) {
> +	if (ret >= 0) {
>  		cpufreq_cdev->cpufreq_state = state;
>  		cpus = cpufreq_cdev->policy->cpus;
>  		max_capacity = arch_scale_cpu_capacity(cpumask_first(cpus));
The following commit has been merged into the thermal/next branch of thermal:
Commit-ID:     a51afb13311cd85b2f638c691b2734622277d8f5
Gitweb:        https://git.kernel.org/pub/scm/linux/kernel/git/thermal/linux.git//a51afb133...
Author:        Viresh Kumar <viresh.kumar@linaro.org>
AuthorDate:    Wed, 17 Feb 2021 11:18:58 +05:30
Committer:     Daniel Lezcano <daniel.lezcano@linaro.org>
CommitterDate: Wed, 17 Feb 2021 18:53:19 +01:00
thermal: cpufreq_cooling: freq_qos_update_request() returns < 0 on error
freq_qos_update_request() returns 1 if the effective constraint value has changed, 0 if the effective constraint value has not changed, or a negative error code on failures.
The frequency constraints for CPUs can be set by different parts of the kernel. If the maximum frequency constraint set by other parts of the kernel is lower than the one corresponding to cooling state 0, then we will never be able to cool down the system, as freq_qos_update_request() will keep on returning 0 and we will skip updating cpufreq_state and thermal pressure.
Fix that by doing the updates even in the case where freq_qos_update_request() returns 0, as we have effectively set the constraint to a new value even if the consolidated value of the actual constraint is unchanged because of external factors.
Cc: v5.7+ <stable@vger.kernel.org> # v5.7+
Reported-by: Thara Gopinath <thara.gopinath@linaro.org>
Fixes: f12e4f66ab6a ("thermal/cpu-cooling: Update thermal pressure in case of a maximum frequency capping")
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Lukasz Luba <lukasz.luba@arm.com>
Tested-by: Lukasz Luba <lukasz.luba@arm.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/b2b7e84944937390256669df5a48ce5abba0c1ef.161354071...
---
 drivers/thermal/cpufreq_cooling.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/thermal/cpufreq_cooling.c b/drivers/thermal/cpufreq_cooling.c
index 612f063..ddc166e 100644
--- a/drivers/thermal/cpufreq_cooling.c
+++ b/drivers/thermal/cpufreq_cooling.c
@@ -441,7 +441,7 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
 	frequency = get_state_freq(cpufreq_cdev, state);
 
 	ret = freq_qos_update_request(&cpufreq_cdev->qos_req, frequency);
-	if (ret > 0) {
+	if (ret >= 0) {
 		cpufreq_cdev->cpufreq_state = state;
 		cpus = cpufreq_cdev->policy->cpus;
 		max_capacity = arch_scale_cpu_capacity(cpumask_first(cpus));
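The hunk context stops right where the block starts deriving the thermal pressure from max_capacity. As a rough user-space sketch of that arithmetic, assuming the pressure is the capacity lost to the cap and that capacity scales linearly with the capped frequency (the rest of the block is not shown in the diff, so this is an assumption, and model_thermal_pressure() is a hypothetical helper, not a kernel function):

```c
#include <assert.h>

/*
 * Hypothetical helper: capacity lost when the maximum frequency is capped.
 * Assumes capacity scales linearly with frequency, so the remaining
 * capacity is capped_khz / max_khz of max_capacity.
 */
unsigned long long model_thermal_pressure(unsigned long long capped_khz,
					  unsigned long long max_khz,
					  unsigned long long max_capacity)
{
	unsigned long long capacity;

	assert(max_khz > 0 && capped_khz <= max_khz);

	/* Multiply first so integer division does not discard precision early. */
	capacity = capped_khz * max_capacity / max_khz;
	return max_capacity - capacity;
}
```

Under these assumptions, a CPU with capacity 1024 and a 2000000 kHz maximum that is capped to 1500000 kHz keeps 768 units of capacity, so the pressure is 256. When the block is skipped because freq_qos_update_request() returned 0, a stale value of this kind stays in place.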