On 09/12/25 21:07, Shawn Guo wrote:
On Fri, Sep 12, 2025 at 12:41:14PM +0200, Rafael J. Wysocki wrote:
On Wed, Sep 10, 2025 at 8:53 AM Shawn Guo shawnguo2@yeah.net wrote:
From: Shawn Guo shawnguo@kernel.org
A regression is seen with the 6.6 -> 6.12 kernel upgrade on platforms where the cpufreq-dt driver sets cpuinfo.transition_latency to CPUFREQ_ETERNAL (-1) because the platform's DT doesn't provide the optional property 'clock-latency-ns'. The dbs sampling_rate was 10000 us on 6.6 and suddenly becomes 6442450 us (4294967295 / 1000 * 1.5) on 6.12 for these platforms, because the 10 ms cap for transition_delay_us was accidentally dropped by the commits below.
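The arithmetic above can be checked with a small user-space model. This is only a sketch, not the kernel source: the function names `delay_us_capped` and `delay_us_uncapped` are hypothetical, the multiplier value of 1000 is an assumption, and the two bodies merely approximate the behaviour described in this thread (scale and cap at 10 ms before, latency plus 50% afterwards).

```c
#include <stdint.h>

#define NSEC_PER_USEC      1000u
#define USEC_PER_MSEC      1000u
#define CPUFREQ_ETERNAL    UINT32_MAX  /* (-1) stored in a 32-bit unsigned field */
#define LATENCY_MULTIPLIER 1000u       /* assumed value of the old multiplier */

/* Hypothetical model of the older behaviour: scale, then cap at 10 ms. */
static uint32_t delay_us_capped(uint32_t transition_latency_ns)
{
    uint64_t latency_us = transition_latency_ns / NSEC_PER_USEC;
    uint64_t scaled;

    if (latency_us == 0)
        return LATENCY_MULTIPLIER;

    /* 64-bit product so this model cannot overflow. */
    scaled = latency_us * LATENCY_MULTIPLIER;
    return scaled < 10000u ? (uint32_t)scaled : 10000u;
}

/* Hypothetical model of the newer behaviour: latency plus 50%, no cap. */
static uint32_t delay_us_uncapped(uint32_t transition_latency_ns)
{
    uint32_t latency_us = transition_latency_ns / NSEC_PER_USEC;

    if (latency_us == 0)
        return USEC_PER_MSEC;

    return latency_us + (latency_us >> 1);
}
```

Under these assumptions, passing CPUFREQ_ETERNAL gives 10000 us in the capped model and 6442450 us in the uncapped one, matching the numbers reported above. A 50 us latency (50000 ns) gives 10000 us capped but 75 us uncapped, which is the case the cap removal was meant to help.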
IIRC, this was not accidental.
I could be wrong, but my understanding is that the intention of Qais's commits is to drop 10 ms (and LATENCY_MULTIPLIER) as the *minimal* limit on transition_delay_us, so that it's possible to get a much smaller transition_delay_us on platforms like the M1 Mac mini where the transition latency is just tens of us. But it breaks platforms where 10 ms used to be the *maximum* limit.
Even if it's intentional to remove 10 ms as both the minimal and maximum limits, breaking some platforms must not be intentional, I guess :)
These limits were arbitrary. The limit was initially reduced to 2 ms, but was then dropped altogether to avoid making assumptions, since any such value is arbitrary.
Why do you want to address the issue in the cpufreq core instead of doing that in the cpufreq-dt driver?
My intuition was to fix the regression where it was introduced, by restoring the previous code behavior.
Isn't the right fix here still at the driver level? We can only give drivers what they ask for. If they ask for something wrong and get a wrong result, it is still their fault, no?
Alternatively, maybe we can add special handling for the CPUFREQ_ETERNAL value, though I'd suggest returning 1 ms (similar to the case of the value being 0). Maybe we can redefine CPUFREQ_ETERNAL to be 0, but I'm not sure whether this would have side effects.
Thanks
-- Qais Yousef
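
For illustration, the special handling of CPUFREQ_ETERNAL suggested above could look roughly like the following user-space sketch. The helper name `delay_us_with_eternal_fallback` is hypothetical and this is not a proposed patch; it only assumes that an "unknown" latency is treated the same way as 0 and falls back to 1 ms, with the latency-plus-50% rule kept for every other value.

```c
#include <stdint.h>

#define NSEC_PER_USEC   1000u
#define USEC_PER_MSEC   1000u
#define CPUFREQ_ETERNAL UINT32_MAX  /* (-1) stored in a 32-bit unsigned field */

/* Hypothetical helper: treat CPUFREQ_ETERNAL like 0 and fall back to 1 ms. */
static uint32_t delay_us_with_eternal_fallback(uint32_t transition_latency_ns)
{
    uint32_t latency_us;

    /* The driver did not report a real latency, so do not scale it. */
    if (transition_latency_ns == CPUFREQ_ETERNAL)
        return USEC_PER_MSEC;

    latency_us = transition_latency_ns / NSEC_PER_USEC;
    if (latency_us == 0)
        return USEC_PER_MSEC;

    /* Otherwise keep the latency plus 50% rule. */
    return latency_us + (latency_us >> 1);
}
```

With this sketch, CPUFREQ_ETERNAL yields 1000 us instead of 6442450 us, while a platform reporting 50000 ns still gets 75 us. Whether 1 ms is the right fallback, and whether it belongs in the core or in cpufreq-dt, is exactly the question being discussed in this thread.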