On Thu, 19 Oct 2023 at 16:49, Stephan Gerhold stephan@gerhold.net wrote:
On Thu, Oct 19, 2023 at 04:12:56PM +0200, Ulf Hansson wrote:
On Thu, 19 Oct 2023 at 15:05, Stephan Gerhold stephan@gerhold.net wrote:
On Thu, Oct 19, 2023 at 01:26:19PM +0200, Ulf Hansson wrote:
On Thu, 19 Oct 2023 at 12:24, Ulf Hansson ulf.hansson@linaro.org wrote:
On Wed, 18 Oct 2023 at 10:06, Stephan Gerhold stephan.gerhold@kernkonzept.com wrote:
The genpd core caches performance state votes from devices that are runtime suspended as of commit 3c5a272202c2 ("PM: domains: Improve runtime PM performance state handling"). They get applied once the device becomes active again.
To attach the power domains needed by qcom-cpufreq-nvmem the OPP core calls genpd_dev_pm_attach_by_id(). This results in "virtual" dummy devices that use runtime PM only to control the enable and performance state for the attached power domain.
However, at the moment nothing ever resumes the virtual devices created for qcom-cpufreq-nvmem. They remain permanently runtime suspended. This means that performance state votes made during cpufreq scaling get always cached and never applied to the hardware.
Fix this by enabling the devices after attaching them and use dev_pm_syscore_device() to ensure the power domains also stay on when going to suspend. Since it supplies the CPU we can never turn it off from Linux. There are other mechanisms to turn it off when needed, usually in the RPM firmware (RPMPD) or the cpuidle path (CPR genpd).
I believe we discussed using dev_pm_syscore_device() for the previous version. It's not intended to be used for things like the above.
Sorry, looks like we still had a misunderstanding in the conclusion of the previous discussion. :')
Moreover, I was under the impression that it wasn't really needed. In fact, I would think that this actually breaks things for system suspend/resume, as in this case the cpr driver's genpd ->power_on|off() callbacks are no longer getting called due this, which means that the cpr state machine isn't going to be restored properly. Or did I get this wrong?
We strictly need the RPMPDs to be always-on, also across system suspend [1]. The RPM firmware will drop the votes internally as soon as the CPU(s) have entered deep cpuidle. We can't do this from Linux, because we need the CPU to continue running until it was shut down cleanly.
For CPR, we strictly need the backing regulator to be always-on, also across system suspend. Typically the hardware will turn off the regulator as soon as the CPU(s) enter deep cpuidle. Similarly, we can't do this from Linux, because we need the CPU to continue running until it was shut down cleanly.
My understanding was that we're going to pause the CPR state machine using the system suspend/resume callbacks on the driver, instead of using the genpd->power_on|off() callbacks [2]. I can submit a separate patch for this.
If we are going to do 1) as described below, this looks to me that it's going to be needed.
Yep.
How will otherwise the cpr state machine be saved/restored during system suspend/resume? Note that, beyond 1), the genpd's ->power_on|off() callbacks are no longer going to be called during system suspend/resume.
(Side note: I think "save/restore" might be the wrong words for suspend/resume of CPR. Looking at the code most of the configuration appears to be preserved across suspend/resume. Nothing is saved, it literally just disables the state machine during suspend and re-enables it during resume.
I'm not entirely sure what's the reason for doing this. Perhaps the main goal is just to prevent the CPR state machine from getting stuck or sending pointless IRQs that won't be handled while Linux is suspended.)
If only the latter, that is a very good reason too. Drivers should take care of their devices to make sure they are not triggering spurious irqs during system suspend.
In a way this also means that the cpr genpd provider might as well also have GENPD_FLAG_ALWAYS_ON set for it.
Conceptually I would consider CPR to be a generic power domain provider that could supply any kind of device. I know at least of CPUs and GPUs. We need "always-on" only for the CPU, but not necessarily for other devices.
For a GPU, the Linux driver (running on the CPU) can stop the GPU, wait for completion and then invoke the ->power_off() callback of CPR. In that case it is also safe to disable the backing regulator from Linux. (I briefly mentioned this already in the previous discussion I think.)
We could set GENPD_FLAG_ALWAYS_ON for the CPR compatibles where we know that they are only used to supply CPUs, but if we're going to do (1) anyway there might not be much of an advantage for the extra complexity.
Okay, fair enough. Let's just stick with 1) and skip using GENPD_FLAG_ALWAYS_ON bit for the cpr genpd provider.
I didn't prioritize this because QCS404 (as the only current user of CPR) doesn't have proper deep cpuidle/power management set up yet. It's not entirely clear to me if there is any advantage (or perhaps even disadvantage) if we pause the CPR state machine while the shared L2 cache is still being actively powered by the CPR power rail during system suspend. I suspect this is a configuration that was never considered in the hardware design.
I see.
Given the strict requirement for the RPMPDs, I only see two options:
- Have an always-on consumer that prevents the power domains to be powered off during system suspend. This is what this patch tries to achieve.
Or:
- Come up with a way to register the RPMPDs used by the CPU with GENPD_FLAG_ALWAYS_ON. This would also be doable, but isn't as straightfoward as "regulator-always-on" in the DT because the rpmpd DT node represents multiple genpds in a single DT node [3].
Yes, it sounds like it may be easier to do 1).
What do you think? Do you see some other solution perhaps? I hope we can clear up the misunderstanding. :-)
Yes, thanks!
BTW, if you really need something like the above, the proper way to do it would instead be to call device_set_awake_path() for the device.
This informs genpd that the device needs to stay powered-on during system suspend (assuming that GENPD_FLAG_ACTIVE_WAKEUP has been set for it), hence it will keep the corresponding PM domain powered-on too.
Thanks, I can try if this works as alternative to the dev_pm_syscore_device()!
Yes, please. We don't want to abuse the dev_pm_syscore_device() thingy.
Could you clarify the idea behind GENPD_FLAG_ACTIVE_WAKEUP? Would I set it conditionally for all RPMPDs or just the ones consumed by the CPU? How does the genpd *provider* know if one of its *consumer* devices needs to have its power domain kept on for wakeup?
We are thinking of the GENPD_FLAG_ACTIVE_WAKEUP as a platform configuration type of flag for the genpd in question. The consumer driver shouldn't need to know about the details of what is happening on the PM domain level - only whether it needs its device to remain powered-on during system suspend or not.
I suspect that the GENPD_FLAG_ACTIVE_WAKEUP is probably okay to set for most genpds, but there may be some exceptions.
Kind regards Uffe