The current code in pcie_wait_for_link_delay() handles the value returned by pcie_failed_link_retrain() as an integer, expecting 0 when the link has been successfully retrained. The issue is that pcie_failed_link_retrain() returns a boolean: "true" if the link has been successfully retrained and "false" otherwise. This leads pcie_wait_for_link_delay() to return an incorrect "active link" status when pcie_failed_link_retrain() is called.
This patch fixes the check of the value returned by pcie_failed_link_retrain() in pcie_wait_for_link_delay().
Note that this bug induces abnormal timeout delays when a PCI device is unplugged (around 60 seconds per bridge / secondary bus removed).
Cc: stable@vger.kernel.org
Fixes: 1abb47390350 ("Merge branch 'pci/enumeration'")
Signed-off-by: Simon Guinot <simon.guinot@seagate.com>
---
 drivers/pci/pci.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index ccee56615f78..7ec91b4c5d03 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -5101,9 +5101,7 @@ static bool pcie_wait_for_link_delay(struct pci_dev *pdev, bool active,
 		msleep(20);
 	rc = pcie_wait_for_link_status(pdev, false, active);
 	if (active) {
-		if (rc)
-			rc = pcie_failed_link_retrain(pdev);
-		if (rc)
+		if (rc && !pcie_failed_link_retrain(pdev))
 			return false;
 
 		msleep(delay);
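
The return-value mix-up described in the commit message can be reproduced in user space. The sketch below is only an illustration, not kernel code: the two stub helpers and the `stub_*` variables are hypothetical stand-ins for pcie_wait_for_link_status() (assumed here to return 0 on success and a nonzero error otherwise) and pcie_failed_link_retrain() (returning "true" on a successful retrain, as the commit message states).

```c
#include <stdbool.h>

/* Hypothetical knobs that control what the stubs report. */
static int stub_wait_status;	/* 0 = link reached the requested state */
static bool stub_retrain_ok;	/* true = retraining brought the link up */

/* Stand-in for pcie_wait_for_link_status(): int, 0 on success (assumed). */
static int wait_for_link_status(void)
{
	return stub_wait_status;
}

/* Stand-in for pcie_failed_link_retrain(): bool, true on success. */
static bool failed_link_retrain(void)
{
	return stub_retrain_ok;
}

/* Logic before the patch: the bool result is reused as an error code. */
static bool link_active_before(void)
{
	int rc = wait_for_link_status();

	if (rc)
		rc = failed_link_retrain();	/* true becomes 1, false becomes 0 */
	if (rc)
		return false;			/* taken when the retrain SUCCEEDED */
	return true;
}

/* Logic after the patch: fail only if both the wait and the retrain fail. */
static bool link_active_after(void)
{
	int rc = wait_for_link_status();

	if (rc && !failed_link_retrain())
		return false;
	return true;
}
```

With the wait failing and the retrain also failing, the pre-patch version reports an active link, while the patched version reports an inactive one. With the wait failing and the retrain succeeding, the results are swapped in the same way.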
On Wed, 13 Mar 2024, Simon Guinot wrote:
> The current code in pcie_wait_for_link_delay() handles the value returned by pcie_failed_link_retrain() as an integer, expecting 0 when the link has been successfully retrained. The issue is that pcie_failed_link_retrain() returns a boolean: "true" if the link has been successfully retrained and "false" otherwise. This leads pcie_wait_for_link_delay() to return an incorrect "active link" status when pcie_failed_link_retrain() is called.
>
> This patch fixes the check of the value returned by pcie_failed_link_retrain() in pcie_wait_for_link_delay().
>
> Note that this bug induces abnormal timeout delays when a PCI device is unplugged (around 60 seconds per bridge / secondary bus removed).
>
> Cc: stable@vger.kernel.org
> Fixes: 1abb47390350 ("Merge branch 'pci/enumeration'")
> Signed-off-by: Simon Guinot <simon.guinot@seagate.com>
Hi Simon,
Thanks for your patch. However, there is already a better series that fixes this and other related issues. Bjorn just hasn't gotten around to applying it yet:
https://patchwork.kernel.org/project/linux-pci/list/?series=824858
(I proposed a patch very similar to yours a month ago, but Maciej came up with a better way to fix all the issues.)
-- i.
>  drivers/pci/pci.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index ccee56615f78..7ec91b4c5d03 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5101,9 +5101,7 @@ static bool pcie_wait_for_link_delay(struct pci_dev *pdev, bool active,
>  		msleep(20);
>  	rc = pcie_wait_for_link_status(pdev, false, active);
>  	if (active) {
> -		if (rc)
> -			rc = pcie_failed_link_retrain(pdev);
> -		if (rc)
> +		if (rc && !pcie_failed_link_retrain(pdev))
>  			return false;
>
>  		msleep(delay);
On Wed, Mar 13, 2024 at 02:00:21PM +0200, Ilpo Järvinen wrote:
> On Wed, 13 Mar 2024, Simon Guinot wrote:
> > The current code in pcie_wait_for_link_delay() handles the value returned by pcie_failed_link_retrain() as an integer, expecting 0 when the link has been successfully retrained. The issue is that pcie_failed_link_retrain() returns a boolean: "true" if the link has been successfully retrained and "false" otherwise. This leads pcie_wait_for_link_delay() to return an incorrect "active link" status when pcie_failed_link_retrain() is called.
> >
> > This patch fixes the check of the value returned by pcie_failed_link_retrain() in pcie_wait_for_link_delay().
> >
> > Note that this bug induces abnormal timeout delays when a PCI device is unplugged (around 60 seconds per bridge / secondary bus removed).
> >
> > Cc: stable@vger.kernel.org
> > Fixes: 1abb47390350 ("Merge branch 'pci/enumeration'")
> > Signed-off-by: Simon Guinot <simon.guinot@seagate.com>
>
> Hi Simon,
>
> Thanks for your patch. However, there is already a better series that fixes this and other related issues. Bjorn just hasn't gotten around to applying it yet:
>
> https://patchwork.kernel.org/project/linux-pci/list/?series=824858
>
> (I proposed a patch very similar to yours a month ago, but Maciej came up with a better way to fix all the issues.)

Hi Ilpo,
Thanks for pointing out this patch series. It does indeed fix the timeout delay issue I observed.
Simon
linux-stable-mirror@lists.linaro.org