The patch below does not apply to the 4.14-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id, to <stable@vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y
git checkout FETCH_HEAD
git cherry-pick -x 8e47363588377e1bdb65e2b020b409cfb44dd260
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to 'stable@vger.kernel.org' --in-reply-to '167819253422103@kroah.com' --subject-prefix 'PATCH 4.14.y' HEAD^..
Possible dependencies:
8e4736358837 ("thermal: intel: powerclamp: Fix cur_state for multi package system")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8e47363588377e1bdb65e2b020b409cfb44dd260 Mon Sep 17 00:00:00 2001
From: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Date: Wed, 1 Feb 2023 12:39:41 -0800
Subject: [PATCH] thermal: intel: powerclamp: Fix cur_state for multi package system
The powerclamp cooling device cur_state shows actual idle observed by package C-state idle counters. But the implementation is not sufficient for multi package or multi die system. The cur_state value is incorrect. On these systems, these counters must be read from each package/die and somehow aggregate them. But there is no good method for aggregation.
It was not a problem when explicit CPU model addition was required to enable intel powerclamp. In this way certain CPU models could have been avoided. But with the removal of CPU model check with the availability of Package C-state counters, the driver is loaded on most of the recent systems.
For multi package/die systems, just show the actual target idle state, the system is trying to achieve. In powerclamp this is the user set state minus one.
Also there is no use of starting a worker thread for polling package C-state counters and applying any compensation for multiple package or multiple die systems.
Fixes: b721ca0d1927 ("thermal/powerclamp: remove cpu whitelist")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 4.14+ <stable@vger.kernel.org> # 4.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
diff --git a/drivers/thermal/intel/intel_powerclamp.c b/drivers/thermal/intel/intel_powerclamp.c
index b80e25ec1261..2f4cbfdf26a0 100644
--- a/drivers/thermal/intel/intel_powerclamp.c
+++ b/drivers/thermal/intel/intel_powerclamp.c
@@ -57,6 +57,7 @@
 
 static unsigned int target_mwait;
 static struct dentry *debug_dir;
+static bool poll_pkg_cstate_enable;
 
 /* user selected target */
 static unsigned int set_target_ratio;
@@ -261,6 +262,9 @@ static unsigned int get_compensation(int ratio)
 {
 	unsigned int comp = 0;
 
+	if (!poll_pkg_cstate_enable)
+		return 0;
+
 	/* we only use compensation if all adjacent ones are good */
 	if (ratio == 1 &&
 		cal_data[ratio].confidence >= CONFIDENCE_OK &&
@@ -519,7 +523,8 @@ static int start_power_clamp(void)
 	control_cpu = cpumask_first(cpu_online_mask);
 
 	clamping = true;
-	schedule_delayed_work(&poll_pkg_cstate_work, 0);
+	if (poll_pkg_cstate_enable)
+		schedule_delayed_work(&poll_pkg_cstate_work, 0);
 
 	/* start one kthread worker per online cpu */
 	for_each_online_cpu(cpu) {
@@ -585,11 +590,15 @@ static int powerclamp_get_max_state(struct thermal_cooling_device *cdev,
 static int powerclamp_get_cur_state(struct thermal_cooling_device *cdev,
 				 unsigned long *state)
 {
-	if (true == clamping)
-		*state = pkg_cstate_ratio_cur;
-	else
+	if (clamping) {
+		if (poll_pkg_cstate_enable)
+			*state = pkg_cstate_ratio_cur;
+		else
+			*state = set_target_ratio;
+	} else {
 		/* to save power, do not poll idle ratio while not clamping */
 		*state = -1; /* indicates invalid state */
+	}
 
 	return 0;
 }
@@ -712,6 +721,9 @@ static int __init powerclamp_init(void)
 		goto exit_unregister;
 	}
 
+	if (topology_max_packages() == 1 && topology_max_die_per_package() == 1)
+		poll_pkg_cstate_enable = true;
+
 	cooling_dev = thermal_cooling_device_register("intel_powerclamp", NULL,
 						&powerclamp_cooling_ops);
 	if (IS_ERR(cooling_dev)) {
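
For readers who want to reason about the new cur_state behaviour without building a kernel, the decision made by the hunks above can be modelled in user space roughly as follows. This is only an illustrative sketch: struct clamp_model, model_poll_enable() and model_get_cur_state() are hypothetical names that do not exist in the driver, and the topology values are passed in as plain integers rather than read from topology_max_packages() and topology_max_die_per_package().

```c
#include <stdbool.h>

/*
 * Hypothetical user-space stand-in for the driver's file-scope state.
 * Field names mirror the variables touched by the patch; the struct
 * itself is an assumption made for this sketch.
 */
struct clamp_model {
	bool clamping;                     /* true while idle injection runs */
	bool poll_pkg_cstate_enable;       /* flag introduced by the patch */
	unsigned int pkg_cstate_ratio_cur; /* idle ratio observed via counters */
	unsigned int set_target_ratio;     /* target requested by the user */
};

/* Counter polling is only trusted on single-package, single-die systems. */
static bool model_poll_enable(unsigned int max_packages,
			      unsigned int max_dies_per_package)
{
	return max_packages == 1 && max_dies_per_package == 1;
}

/* Mirrors the branch structure of the patched powerclamp_get_cur_state(). */
static unsigned long model_get_cur_state(const struct clamp_model *m)
{
	if (!m->clamping)
		return (unsigned long)-1; /* invalid state while not clamping */

	if (m->poll_pkg_cstate_enable)
		return m->pkg_cstate_ratio_cur;

	return m->set_target_ratio;
}
```

With the flag computed from model_poll_enable(2, 1), a clamping system reports set_target_ratio; with model_poll_enable(1, 1) it reports the observed pkg_cstate_ratio_cur, and in both cases a non-clamping system reports (unsigned long)-1.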
The powerclamp cooling device cur_state shows actual idle observed by package C-state idle counters. But the implementation is not sufficient for multi package or multi die system. The cur_state value is incorrect. On these systems, these counters must be read from each package/die and somehow aggregate them. But there is no good method for aggregation.
It was not a problem when explicit CPU model addition was required to enable intel powerclamp. In this way certain CPU models could have been avoided. But with the removal of CPU model check with the availability of Package C-state counters, the driver is loaded on most of the recent systems.
For multi package/die systems, just show the actual target idle state, the system is trying to achieve. In powerclamp this is the user set state minus one.
Also there is no use of starting a worker thread for polling package C-state counters and applying any compensation for multiple package or multiple die systems.
Fixes: b721ca0d1927 ("thermal/powerclamp: remove cpu whitelist")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 4.14+ <stable@vger.kernel.org> # 4.14+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
(cherry picked from commit 8e47363588377e1bdb65e2b020b409cfb44dd260)
---
 drivers/thermal/intel_powerclamp.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)
diff --git a/drivers/thermal/intel_powerclamp.c b/drivers/thermal/intel_powerclamp.c
index 68cc88637e06..1f79eca3754e 100644
--- a/drivers/thermal/intel_powerclamp.c
+++ b/drivers/thermal/intel_powerclamp.c
@@ -72,6 +72,7 @@
 
 static unsigned int target_mwait;
 static struct dentry *debug_dir;
+static bool poll_pkg_cstate_enable;
 
 /* user selected target */
 static unsigned int set_target_ratio;
@@ -280,6 +281,9 @@ static unsigned int get_compensation(int ratio)
 {
 	unsigned int comp = 0;
 
+	if (!poll_pkg_cstate_enable)
+		return 0;
+
 	/* we only use compensation if all adjacent ones are good */
 	if (ratio == 1 &&
 		cal_data[ratio].confidence >= CONFIDENCE_OK &&
@@ -552,7 +556,8 @@ static int start_power_clamp(void)
 	control_cpu = cpumask_first(cpu_online_mask);
 
 	clamping = true;
-	schedule_delayed_work(&poll_pkg_cstate_work, 0);
+	if (poll_pkg_cstate_enable)
+		schedule_delayed_work(&poll_pkg_cstate_work, 0);
 
 	/* start one kthread worker per online cpu */
 	for_each_online_cpu(cpu) {
@@ -621,11 +626,15 @@ static int powerclamp_get_max_state(struct thermal_cooling_device *cdev,
 static int powerclamp_get_cur_state(struct thermal_cooling_device *cdev,
 				 unsigned long *state)
 {
-	if (true == clamping)
-		*state = pkg_cstate_ratio_cur;
-	else
+	if (clamping) {
+		if (poll_pkg_cstate_enable)
+			*state = pkg_cstate_ratio_cur;
+		else
+			*state = set_target_ratio;
+	} else {
 		/* to save power, do not poll idle ratio while not clamping */
 		*state = -1; /* indicates invalid state */
+	}
 
 	return 0;
 }
@@ -770,6 +779,9 @@ static int __init powerclamp_init(void)
 		goto exit_unregister;
 	}
 
+	if (topology_max_packages() == 1)
+		poll_pkg_cstate_enable = true;
+
 	cooling_dev = thermal_cooling_device_register("intel_powerclamp", NULL,
 						&powerclamp_cooling_ops);
 	if (IS_ERR(cooling_dev)) {
linux-stable-mirror@lists.linaro.org