On Fri, 28 Feb 2025, Lukas Wunner wrote:
On Thu, Feb 27, 2025 at 06:31:08AM +0100, Lukas Wunner wrote:
pcie_bwnotif_irq() accesses the Link Status register without acquiring a runtime PM reference on the PCIe port. This feels wrong and may also contribute to the issue reported here. Acquiring a runtime PM ref may sleep, so I think you need to change the driver to use a threaded IRQ handler.
I've realized we've had a discussion before why a threaded IRQ handler doesn't make sense...
Yes.
https://lore.kernel.org/all/Z35qJ3H_8u5LQDJ6@wunner.de/
...but I'm still worried that a Downstream Port in a nested-switch configuration may be runtime suspended while the hardirq handler is running. Is there anything preventing that from happening?
I don't think there is.
To access config space of a port, it's sufficient if its upstream bridge is runtime active (i.e. in PCI D0).
So basically the below is what I have in mind. This assumes that the upstream bridge is still in D0 when the interrupt handler runs because in atomic context we can't wait for it to be runtime resumed. Seems like a fair assumption to me but what do I know...
bwctrl doesn't even want to resume the port in the irqhandler. If the port is suspended, why would it have LBMS/LABS, and we disabled notifications anyway in suspend handler anyway so we're not even expecting them to come during a period of suspend (which does not mean there couldn't be interrupts due to other sources).
So there should be no problem in not calling resume for it.
-- >8 --
diff --git a/drivers/pci/pcie/bwctrl.c b/drivers/pci/pcie/bwctrl.c index 0a5e7efbce2c..fea8f7412266 100644 --- a/drivers/pci/pcie/bwctrl.c +++ b/drivers/pci/pcie/bwctrl.c @@ -28,6 +28,7 @@ #include <linux/mutex.h> #include <linux/pci.h> #include <linux/pci-bwctrl.h> +#include <linux/pm_runtime.h> #include <linux/rwsem.h> #include <linux/slab.h> #include <linux/types.h> @@ -235,9 +236,13 @@ static irqreturn_t pcie_bwnotif_irq(int irq, void *context) struct pcie_device *srv = context; struct pcie_bwctrl_data *data = srv->port->link_bwctrl; struct pci_dev *port = srv->port;
- struct device *parent __free(pm_runtime_put) = port->dev.parent; u16 link_status, events; int ret;
- if (parent)
pm_runtime_get_noresume(parent);
Should this then check if its suspended and return early if it is suspended?
pm_runtime_suspended() has some caveats in the kerneldoc though so I'm a bit unsure if it can be called safely here, probably not.
ret = pcie_capability_read_word(port, PCI_EXP_LNKSTA, &link_status); if (ret != PCIBIOS_SUCCESSFUL) return IRQ_NONE; diff --git a/include/linux/pm_runtime.h b/include/linux/pm_runtime.h index d39dc863f612..038228de773d 100644 --- a/include/linux/pm_runtime.h +++ b/include/linux/pm_runtime.h @@ -448,6 +448,8 @@ static inline int pm_runtime_put(struct device *dev) return __pm_runtime_idle(dev, RPM_GET_PUT | RPM_ASYNC); } +DEFINE_FREE(pm_runtime_put, struct device *, if (_T) pm_runtime_put(_T))
/**
- __pm_runtime_put_autosuspend - Drop device usage counter and queue autosuspend if 0.
- @dev: Target device.