In the future, please send reports like this to the regressions M/L and CC the relevant people instead of just sending a private message.
For now, I've added the @regressions and @stable mailing lists, since you found this issue specifically in the LTS series.
Hi Lars,
Can you please test 6.9.7? If this is still failing, can you please check 6.10-rc6?
I'd like to understand if we just have a missing commit to backport or it's a problem in the mainline kernel as well.
From the below description it's specifically with boost in passive mode, right?
If 6.10-rc6 is still affected, can you please see if this commit helps? https://git.kernel.org/pub/scm/linux/kernel/git/superm1/linux.git/commit/?h=...
This is going into 6.11-rc1.
Perry, Jassmine,
Can you try to repro this using bleeding-edge or linux-next branches?
Thanks,
On 7/1/2024 4:33, Huang, Ray wrote:
[AMD Official Use Only - AMD Internal Distribution Only]
Hi all,
Could you please help with a quick fix?
-----Original Message-----
From: Lars Wendler <wendler.lars@web.de>
Sent: Monday, July 1, 2024 5:30 PM
To: Huang, Ray <Ray.Huang@amd.com>
Cc: gregkh@linuxfoundation.org
Subject: linux-6.6.y: Regression in amd-pstate cpufreq driver since 6.6.34
Hello dear kernel developers,
I might have found a regression in the amd-pstate driver of the linux-6.6 stable series. I haven't checked linux-master or any other LTS branch.
Now here's what I have found:
Since linux-6.6.34 the following command fails:
    # echo 0 > /sys/devices/system/cpu/cpufreq/boost
    -bash: echo: write error: Invalid argument
and indeed, disabling CPU boost seems to not work:
    # cat /sys/devices/system/cpu/cpufreq/boost
    1
I have bisected the issue to commit 8f893e52b9e030a25ea62e31271bf930b01f2f07:
    cpufreq: amd-pstate: Fix the inconsistency in max frequency units

    commit e4731baaf29438508197d3a8a6d4f5a8c51663f8 upstream.
Reverting that commit (even on latest linux-6.6 release) gives me back the ability to disable CPU boost again.
I can only reproduce this bug on my Zen4 machine:
    # lscpu | grep "^Model name:" | sed 's@[[:space:]][[:space:]]+@ @'
    Model name: AMD Ryzen 7 7745HX with Radeon Graphics
My older Zen3 machines seem not to be affected by this issue. All my Ryzen systems run on latest linux-6.6 kernels and have the following configuration regarding amd-pstate:
    # zgrep -F AMD_PSTATE /proc/config.gz
    CONFIG_X86_AMD_PSTATE=y
    CONFIG_X86_AMD_PSTATE_DEFAULT_MODE=2
    # CONFIG_X86_AMD_PSTATE_UT is not set
If you need more information, please don't hesitate to ask.
Kind regards
Lars Wendler
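The reproduction steps in the report above could be scripted roughly as follows. This is a minimal sketch and not part of the original report: the helper name `try_set_boost` and its return strings are hypothetical, and only the sysfs path is taken from the report. It writes 0 or 1 to the boost knob and returns either the value read back or the errno name of the failure (the report shows "Invalid argument", i.e. EINVAL).

```python
import errno
from pathlib import Path

# Path taken from the report; everything else in this sketch is hypothetical.
BOOST_PATH = Path("/sys/devices/system/cpu/cpufreq/boost")


def try_set_boost(enabled: bool, path: Path = BOOST_PATH) -> str:
    """Write 1 or 0 to the boost knob and describe what happened."""
    try:
        path.write_text("1" if enabled else "0")
    except OSError as exc:
        # Map the errno number to its symbolic name, e.g. 22 -> "EINVAL".
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        return f"write failed: {name}"
    # Read the knob back so a silently ignored write is also visible.
    return f"boost now reads: {path.read_text().strip()}"
```

Called as root, `try_set_boost(False)` would be expected to return a `write failed: EINVAL` string on an affected kernel, going by the error shown in the report. The `path` parameter exists only so the helper can be exercised against an ordinary file.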
Hello Mario,
On Mon, 1 Jul 2024 10:07:59 -0500, Mario Limonciello <mario.limonciello@amd.com> wrote:
Can you please test 6.9.7? If this is still failing, can you please check 6.10-rc6?
I've tested both 6.9.7 and 6.10-rc6, and neither has that issue. I can disable CPU boost with both kernel versions.
From the below description it's specifically with boost in passive mode, right?
I have only tested the passive mode on all my Ryzen systems and only my Zen4 machine shows this regression.
I've tested both 6.9.7 and 6.10-rc6, and neither has that issue. I can disable CPU boost with both kernel versions.
Thanks for checking those. It's good to hear that it's only an issue in the LTS series.
It means we have the option to either drop that patch from the LTS kernel series or identify the other commit(s) that helped it.
Can you see if adding this commit to 6.6.y helps you?
https://git.kernel.org/superm1/c/8164f743326404fbe00a721a12efd86b2a8d74d2
I have only tested the passive mode on all my Ryzen systems and only my Zen4 machine shows this regression.
That's an interesting finding. Do you know if your other system(s) support preferred cores?
Also, out of curiosity, why don't you use active mode (EPP)? Most people find a better balance of performance and efficiency with EPP.
Hello Mario,
On Mon, 1 Jul 2024 10:58:17 -0500, Mario Limonciello <mario.limonciello@amd.com> wrote:
Can you see if adding this commit to 6.6.y helps you?
https://git.kernel.org/superm1/c/8164f743326404fbe00a721a12efd86b2a8d74d2
that commit does not fix the regression.
On 7/1/2024 11:13, Lars Wendler wrote:
that commit does not fix the regression.
I think I might have found the issue.
With that commit backported to 6.6.y, amd_pstate_set_boost() sets the policy max frequency to nominal * 1000 [1].
However, on 6.6.y amd_get_nominal_freq() already returns nominal * 1000 [2], so the value ends up scaled twice.
By comparison, on 6.9 get_nominal_freq() doesn't multiply by 1000 [3].
So the patch only makes sense on 6.9 and later.
We should revert it in 6.6.y.
Greg,
Can you please revert 8f893e52b9e0 ("cpufreq: amd-pstate: Fix the inconsistency in max frequency units") in 6.6.y?
[1] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/driver...
[2] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/driver...
[3] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/driver...
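The unit mismatch described in this message can be modelled in a few lines. This is an illustrative sketch in Python rather than kernel code: the function names and the 3600 MHz nominal frequency are hypothetical, and the behaviour of each helper is taken only from the description above, not from the kernel sources.

```python
# Hypothetical nominal frequency in MHz, chosen only for illustration.
NOMINAL_MHZ = 3600


def nominal_freq_6_6(nominal_mhz: int) -> int:
    """Models the 6.6.y helper as described: it already scales by 1000."""
    return nominal_mhz * 1000


def nominal_freq_6_9(nominal_mhz: int) -> int:
    """Models the 6.9 helper as described: it returns the value unscaled."""
    return nominal_mhz


def patched_policy_max(nominal_from_helper: int) -> int:
    """Models the patched boost path as described: policy max = nominal * 1000."""
    return nominal_from_helper * 1000


# On 6.9 the patch performs the only scaling step.
max_6_9 = patched_policy_max(nominal_freq_6_9(NOMINAL_MHZ))

# On 6.6.y the helper has already scaled, so the patch scales a second time.
max_6_6 = patched_policy_max(nominal_freq_6_6(NOMINAL_MHZ))

print(max_6_9, max_6_6)
```

Under these assumptions the 6.6.y result is 1000 times larger than the 6.9 result, which is consistent with the conclusion that the patch only makes sense on 6.9 and later.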
On Mon, Jul 01, 2024 at 04:53:20PM -0500, Mario Limonciello wrote:
Can you please revert 8f893e52b9e0 ("cpufreq: amd-pstate: Fix the inconsistency in max frequency units") in 6.6.y?
Sure, but why only 6.6.y? What about 6.1.y, should it be reverted from there as well?
thanks,
greg k-h
On Tue, Jul 02, 2024 at 11:15:14AM +0200, Greg Kroah-Hartman wrote:
Sure, but why only 6.6.y? What about 6.1.y, should it be reverted from there as well?
And have now done so.
On 7/2/2024 4:23, Greg Kroah-Hartman wrote:
Sure, but why only 6.6.y? What about 6.1.y, should it be reverted from there as well?
And have now done so.
Thanks; totally agree with you. I just didn't realize it was backported to 6.1 also.
linux-stable-mirror@lists.linaro.org