On Wed, Jul 02, 2025 at 12:53:07PM GMT, Bjorn Helgaas wrote:
On Wed, Jul 02, 2025 at 12:17:00PM +0530, Manivannan Sadhasivam wrote:
On Tue, Jul 01, 2025 at 03:35:26PM -0500, Bjorn Helgaas wrote:
[+cc Bart]
On Tue, Jul 01, 2025 at 12:17:31PM +0530, Manivannan Sadhasivam wrote:
If devicetree describes power supplies related to a PCI device, we previously created a pwrctrl device even if CONFIG_PCI_PWRCTL was not enabled.
When pci_pwrctrl_create_device() creates and returns a pwrctrl device, pci_scan_device() doesn't enumerate the PCI device. It assumes the pwrctrl core will rescan the bus after turning on the power. However, if CONFIG_PCI_PWRCTL is not enabled, the rescan never happens.
Separate from this patch, can we refine the comment in pci_scan_device() to explain *why* we should skip scanning if a pwrctrl device was created? The current comment leaves me with two questions:
- How do we know the pwrctrl device is currently off? If it is already on, why should we defer enumerating the device?
I believe you meant to ask "how do we know the PCI device is currently off". If the pwrctrl device is created, then we for sure know that the pwrctrl driver will power on the PCI device at some point (depending on when the driver gets loaded). Even if the device was already powered on, we do not want to probe the client driver because, we have seen race between pwrctrl driver and PCI client driver probing in parallel. So I had imposed a devlink dependency (see b458ff7e8176) that makes sure that the PCI client driver wouldn't get probed until the pwrctrl driver (if the pwrctrl device was created) is probed. This will ensure that the PCI device state is reset and initialized by the pwrctrl driver before the client driver probes.
I'm confused about this. Assume there is a pwrctrl device and the related PCI device is already powered on when Linux boots. Apparently we do NOT want to enumerate the PCI device? We want to wait for the pwrctrl driver to claim the pwrctrl device and do a rescan? Even though the pwrctrl driver may be a loadable module and may not even be available at all?
It seems to me that a PCI device that is already powered on should be enumerated and made available. If there's a pwrctrl device for it, and we decide to load pwrctrl, then we also get the ability to turn the PCI device off and on again as needed. But if we *don't* load pwrctrl, it seems like we should still be able to use a PCI device that's already powered on.
The problem with enumerating the PCI device which was already powered on is that the pwrctrl driver cannot reliably know whether the device is powered on or not. So by the time the pwrctrl driver probes, the client driver might also be probing. For the case of WLAN chipsets, the pwrctrl driver used to sample the EN (Enable) GPIO pin to know whether the device is powered on or not (see a9aaf1ff88a8), but that also turned out to be racy and people were complaining.
So to simplify things, we enforced this dependency.
- If the pwrctrl device is currently off, won't the Vendor ID read just fail like it does for every other non-existent device? If so, why can't we just let that happen?
Again, it is not the pwrctrl device that is off, it is the PCI device. If it is not turned on, yes VID read will fail, but why do we need to read the VID in the first place if we know that the PCI device requires pwrctrl and the pwrctrl driver is going to be probed later.
I was assuming pwrctrl is only required if we want to turn the PCI device power on or off. Maybe that's not true?
Pretty much so, but we will also use it to do D3Cold (during system suspend) in the near future.
This behavior is from 2489eeb777af ("PCI/pwrctrl: Skip scanning for the device further if pwrctrl device is created"), which just says "there's no need to continue scanning." Prior to 2489eeb777af, it looks like we *did* what try to enumerate the device even if a pwrctrl device was created, and 2489eeb777af doesn't mention a bug fix, so I assume it's just an optimization.
Yes, it is indeed an optimization.
To summarize, we have imposed a dependency between the PCI client driver and pwrctrl driver to make sure that the PCI device state is fully reset and initialized before the client driver probes.
If the PCI device is already powered on, what more should we do to fully reset and initialize it? If it needed more initialization, I assume the platform should have left it powered off.
As I mentioned above, we cannot reliably detect whether a device is already powered on or not from the pwrctrl driver when it probes. So because of that reason, we enforce dependency and always reset/initialize the device to POR state. If there is a reliable way
- Mani