From: Mario Limonciello <mario.limonciello(a)amd.com>
On some platforms it has been observed that STT limits are not being applied
properly causing poor performance as power limits are set too low.
STT limits that are sent to the platform are supposed to be in Q8.8
format. Convert them before sending.
Reported-by: Yijun Shen <Yijun.Shen(a)dell.com>
Fixes: 7c45534afa443 ("platform/x86/amd/pmf: Add support for PMF Policy Binary")
Cc: stable(a)vger.kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello(a)amd.com>
---
drivers/platform/x86/amd/pmf/tee-if.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/platform/x86/amd/pmf/tee-if.c b/drivers/platform/x86/amd/pmf/tee-if.c
index 5d513161d7302..9a51258df0564 100644
--- a/drivers/platform/x86/amd/pmf/tee-if.c
+++ b/drivers/platform/x86/amd/pmf/tee-if.c
@@ -123,7 +123,7 @@ static void amd_pmf_apply_policies(struct amd_pmf_dev *dev, struct ta_pmf_enact_
case PMF_POLICY_STT_SKINTEMP_APU:
if (dev->prev_data->stt_skintemp_apu != val) {
- amd_pmf_send_cmd(dev, SET_STT_LIMIT_APU, false, val, NULL);
+ amd_pmf_send_cmd(dev, SET_STT_LIMIT_APU, false, val << 8, NULL);
dev_dbg(dev->dev, "update STT_SKINTEMP_APU: %u\n", val);
dev->prev_data->stt_skintemp_apu = val;
}
@@ -131,7 +131,7 @@ static void amd_pmf_apply_policies(struct amd_pmf_dev *dev, struct ta_pmf_enact_
case PMF_POLICY_STT_SKINTEMP_HS2:
if (dev->prev_data->stt_skintemp_hs2 != val) {
- amd_pmf_send_cmd(dev, SET_STT_LIMIT_HS2, false, val, NULL);
+ amd_pmf_send_cmd(dev, SET_STT_LIMIT_HS2, false, val << 8, NULL);
dev_dbg(dev->dev, "update STT_SKINTEMP_HS2: %u\n", val);
dev->prev_data->stt_skintemp_hs2 = val;
}
--
2.43.0
From: Saurabh Sengar <ssengar(a)linux.microsoft.com>
On a x86 system under test with 1780 CPUs, topology_span_sane() takes
around 8 seconds cumulatively for all the iterations. It is an expensive
operation which does the sanity of non-NUMA topology masks.
CPU topology is not something which changes very frequently hence make
this check optional for the systems where the topology is trusted and
need faster bootup.
Restrict this to sched_verbose kernel cmdline option so that this penalty
can be avoided for the systems who want to avoid it.
Cc: stable(a)vger.kernel.org
Fixes: ccf74128d66c ("sched/topology: Assert non-NUMA topology masks don't (partially) overlap")
Signed-off-by: Saurabh Sengar <ssengar(a)linux.microsoft.com>
Co-developed-by: Naman Jain <namjain(a)linux.microsoft.com>
Signed-off-by: Naman Jain <namjain(a)linux.microsoft.com>
Tested-by: K Prateek Nayak <kprateek.nayak(a)amd.com>
---
Changes since v4:
https://lore.kernel.org/all/20250306055354.52915-1-namjain@linux.microsoft.…
- Rephrased print statement and moved it to sched_domain_debug.
(addressing Valentin's comments)
Changes since v3:
https://lore.kernel.org/all/20250203114738.3109-1-namjain@linux.microsoft.c…
- Minor typo correction in comment
- Added Tested-by tag from Prateek for x86
Changes since v2:
https://lore.kernel.org/all/1731922777-7121-1-git-send-email-ssengar@linux.…
- Use sched_debug() instead of using sched_debug_verbose
variable directly (addressing Prateek's comment)
Changes since v1:
https://lore.kernel.org/all/1729619853-2597-1-git-send-email-ssengar@linux.…
- Use kernel cmdline param instead of compile time flag.
Adding a link to the other patch which is under review.
https://lore.kernel.org/lkml/20241031200431.182443-1-steve.wahl@hpe.com/
Above patch tries to optimize the topology sanity check, whereas this
patch makes it optional. We believe both patches can coexist, as even
with optimization, there will still be some performance overhead for
this check.
---
kernel/sched/topology.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index c49aea8c1025..d7254c47af45 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -132,8 +132,11 @@ static void sched_domain_debug(struct sched_domain *sd, int cpu)
{
int level = 0;
- if (!sched_debug_verbose)
+ if (!sched_debug_verbose) {
+ pr_info_once("%s: Scheduler topology debugging disabled, add 'sched_verbose' to the cmdline to enable it\n",
+ __func__);
return;
+ }
if (!sd) {
printk(KERN_DEBUG "CPU%d attaching NULL sched-domain.\n", cpu);
@@ -2359,6 +2362,10 @@ static bool topology_span_sane(struct sched_domain_topology_level *tl,
{
int i = cpu + 1;
+ /* Skip the topology sanity check for non-debug, as it is a time-consuming operation */
+ if (!sched_debug())
+ return true;
+
/* NUMA levels are allowed to overlap */
if (tl->flags & SDTL_OVERLAP)
return true;
base-commit: 7ec162622e66a4ff886f8f28712ea1b13069e1aa
--
2.34.1
This reverts commit c0a40097f0bc81deafc15f9195d1fb54595cd6d0.
Probing a device can take arbitrary long time. In the field we observed
that, for example, probing a bad micro-SD cards in an external USB card
reader (or maybe cards were good but cables were flaky) sometimes takes
longer than 2 minutes due to multiple retries at various levels of the
stack. We can not block uevent_show() method for that long because udev
is reading that attribute very often and that blocks udev and interferes
with booting of the system.
The change that introduced locking was concerned with dev_uevent()
racing with unbinding the driver. However we can handle it without
locking (which will be done in subsequent patch).
There was also claim that synchronization with probe() is needed to
properly load USB drivers, however this is a red herring: the change
adding the lock was introduced in May of last year and USB loading and
probing worked properly for many years before that.
Revert the harmful locking.
Cc: stable(a)vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
---
drivers/base/core.c | 3 ---
1 file changed, 3 deletions(-)
v3: no changes.
v2: added Cc: stable, no code changes.
diff --git a/drivers/base/core.c b/drivers/base/core.c
index d2f9d3a59d6b..f9c1c623bca5 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2726,11 +2726,8 @@ static ssize_t uevent_show(struct device *dev, struct device_attribute *attr,
if (!env)
return -ENOMEM;
- /* Synchronize with really_probe() */
- device_lock(dev);
/* let the kset specific function add its keys */
retval = kset->uevent_ops->uevent(&dev->kobj, env);
- device_unlock(dev);
if (retval)
goto out;
--
2.49.0.rc0.332.g42c0ae87b1-goog