commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") changed pci_bridge_d3_possible() so that any vendor's PCIe ports from modern machines (>=2015) are allowed to be put into D3.
Iain reports that USB devices can't be used to wake a Lenovo Z13 from suspend. This is because the PCIe root port has been put into D3 and AMD's platform can't handle USB devices waking in this case.
This behavior is only reported on Linux. Comparing the behavior on Windows and Linux, Windows doesn't put the root ports into D3.
To fix the issue without regressing existing Intel systems, limit the >=2015 check to only apply to Intel PCIe ports.
Cc: stable@vger.kernel.org Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") Reported-by: Iain Lane iain@orangesquash.org.uk Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-exter... Reviewed-by:Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com --- In v14 this series has been split into 3 parts. part A: Immediate fix for AMD issue. part B: LPS0 export improvements part C: Long term solution for all vendors v13->v14: * Reword the comment * add tag v12->v13: * New patch --- drivers/pci/pci.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 60230da957e0c..bfdad2eb36d13 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -3037,10 +3037,15 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge) return false;
/* - * It should be safe to put PCIe ports from 2015 or newer - * to D3. + * Allow Intel PCIe ports from 2015 onward to go into D3 to + * achieve additional energy conservation on some platforms. + * + * This is only set for Intel PCIe ports as it causes problems + * on both AMD Rembrandt and Phoenix platforms where USB keyboards + * can not be used to wake the system from suspend. */ - if (dmi_get_bios_year() >= 2015) + if (bridge->vendor == PCI_VENDOR_ID_INTEL && + dmi_get_bios_year() >= 2015) return true; break; }
On Fri, Aug 18, 2023 at 9:40 PM Mario Limonciello mario.limonciello@amd.com wrote:
commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") changed pci_bridge_d3_possible() so that any vendor's PCIe ports from modern machines (>=2015) are allowed to be put into D3.
Iain reports that USB devices can't be used to wake a Lenovo Z13 from suspend. This is because the PCIe root port has been put into D3 and AMD's platform can't handle USB devices waking in this case.
This behavior is only reported on Linux. Comparing the behavior on Windows and Linux, Windows doesn't put the root ports into D3.
To fix the issue without regressing existing Intel systems, limit the >=2015 check to only apply to Intel PCIe ports.
Cc: stable@vger.kernel.org Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") Reported-by: Iain Lane iain@orangesquash.org.uk Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-exter... Reviewed-by:Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com
Acked-by: Rafael J. Wysocki rafael@kernel.org
In v14 this series has been split into 3 parts. part A: Immediate fix for AMD issue. part B: LPS0 export improvements part C: Long term solution for all vendors v13->v14:
- Reword the comment
- add tag
v12->v13:
- New patch
drivers/pci/pci.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 60230da957e0c..bfdad2eb36d13 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -3037,10 +3037,15 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge) return false;
/*
* It should be safe to put PCIe ports from 2015 or newer
* to D3.
* Allow Intel PCIe ports from 2015 onward to go into D3 to
* achieve additional energy conservation on some platforms.
*
* This is only set for Intel PCIe ports as it causes problems
* on both AMD Rembrandt and Phoenix platforms where USB keyboards
* can not be used to wake the system from suspend. */
if (dmi_get_bios_year() >= 2015)
if (bridge->vendor == PCI_VENDOR_ID_INTEL &&
dmi_get_bios_year() >= 2015) return true; break; }
-- 2.34.1
On Fri, Aug 18, 2023 at 02:39:32PM -0500, Mario Limonciello wrote:
commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") changed pci_bridge_d3_possible() so that any vendor's PCIe ports from modern machines (>=2015) are allowed to be put into D3.
Iain reports that USB devices can't be used to wake a Lenovo Z13 from suspend. This is because the PCIe root port has been put into D3 and AMD's platform can't handle USB devices waking in this case.
This behavior is only reported on Linux. Comparing the behavior on Windows and Linux, Windows doesn't put the root ports into D3.
To fix the issue without regressing existing Intel systems, limit the >=2015 check to only apply to Intel PCIe ports.
Cc: stable@vger.kernel.org Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") Reported-by: Iain Lane iain@orangesquash.org.uk Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-exter... Reviewed-by:Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com
In v14 this series has been split into 3 parts. part A: Immediate fix for AMD issue. part B: LPS0 export improvements part C: Long term solution for all vendors v13->v14:
- Reword the comment
- add tag
v12->v13:
- New patch
drivers/pci/pci.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 60230da957e0c..bfdad2eb36d13 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -3037,10 +3037,15 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge) return false; /*
* It should be safe to put PCIe ports from 2015 or newer
* to D3.
* Allow Intel PCIe ports from 2015 onward to go into D3 to
* achieve additional energy conservation on some platforms.
*
* This is only set for Intel PCIe ports as it causes problems
* on both AMD Rembrandt and Phoenix platforms where USB keyboards
*/* can not be used to wake the system from suspend.
if (dmi_get_bios_year() >= 2015)
if (bridge->vendor == PCI_VENDOR_ID_INTEL &&
dmi_get_bios_year() >= 2015) return true;
Hmm. I'm really not a fan of checks like this that aren't connected to an actual property of the platform. The Intel Vendor ID tells us nothing about what the actual problem is, which makes it really hard to maintain in the future. It's also very AMD- and Intel-centric, when this code is ostensibly arch-agnostic, so this potentially regresses ARM64, RISC-V, powerpc, etc.
It's bad enough that we check for 2015. A BIOS security update to a 2014 platform will break things, even though the update has nothing to do with D3. We're stuck with that one, and it's old enough that maybe it won't bite us any more, but I hate to add more.
The list of conditions in pci_bridge_d3_possible() is a pretty good clue that we don't really know what we're doing, and all we can do is find configurations that happen to work.
I don't have any better suggestions, other than that this should be described somehow via ACPI (and not in vendor-specific stuff like PNP0D80).
Bjorn
On 8/21/2023 5:42 PM, Bjorn Helgaas wrote:
On Fri, Aug 18, 2023 at 02:39:32PM -0500, Mario Limonciello wrote:
commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") changed pci_bridge_d3_possible() so that any vendor's PCIe ports from modern machines (>=2015) are allowed to be put into D3.
Iain reports that USB devices can't be used to wake a Lenovo Z13 from suspend. This is because the PCIe root port has been put into D3 and AMD's platform can't handle USB devices waking in this case.
This behavior is only reported on Linux. Comparing the behavior on Windows and Linux, Windows doesn't put the root ports into D3.
To fix the issue without regressing existing Intel systems, limit the >=2015 check to only apply to Intel PCIe ports.
Cc: stable@vger.kernel.org Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") Reported-by: Iain Lane iain@orangesquash.org.uk Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-exter... Reviewed-by:Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com
In v14 this series has been split into 3 parts. part A: Immediate fix for AMD issue. part B: LPS0 export improvements part C: Long term solution for all vendors v13->v14:
- Reword the comment
- add tag
v12->v13:
- New patch
drivers/pci/pci.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 60230da957e0c..bfdad2eb36d13 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -3037,10 +3037,15 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge) return false; /*
* It should be safe to put PCIe ports from 2015 or newer
* to D3.
* Allow Intel PCIe ports from 2015 onward to go into D3 to
* achieve additional energy conservation on some platforms.
*
* This is only set for Intel PCIe ports as it causes problems
* on both AMD Rembrandt and Phoenix platforms where USB keyboards
*/* can not be used to wake the system from suspend.
if (dmi_get_bios_year() >= 2015)
if (bridge->vendor == PCI_VENDOR_ID_INTEL &&
dmi_get_bios_year() >= 2015) return true;
Hmm. I'm really not a fan of checks like this that aren't connected to an actual property of the platform. The Intel Vendor ID tells us nothing about what the actual problem is, which makes it really hard to maintain in the future. It's also very AMD- and Intel-centric, when this code is ostensibly arch-agnostic, so this potentially regresses ARM64, RISC-V, powerpc, etc.
It's bad enough that we check for 2015. A BIOS security update to a 2014 platform will break things, even though the update has nothing to do with D3. We're stuck with that one, and it's old enough that maybe it won't bite us any more, but I hate to add more.
I don't see this change as adding any more checks. It's correcting what should have been a more narrow check when that one was introduced.
There was no spec backing the 2015 date, it was just "hardware works with this and it's needed".
The list of conditions in pci_bridge_d3_possible() is a pretty good clue that we don't really know what we're doing, and all we can do is find configurations that happen to work.
I see this function as stemming from three high level desires that intersect.
1) Make hardware not originally designated for the PC ecosystem work. Macs fall in this camp. They don't always adhere to the same "firmware norms" as UEFI PCs do. They don't run Microsoft's certifications, and thus they don't always work the same.
2) Make hardware compatible to the the ACPI and PCI specs where possible.
3) Make hardware compatible with Windows behavior.
In a strict world <2> would be the only one that was followed and everything else would be relegated to quirks or sub-drivers that made decisions, but we are where we are.
I'm not saying it's a bad thing that all 3 goals overlap. If not for the changes that were introduced into this function for Mac compatibility I don't think we would Thunderbolt/USB4 where it is today.
I don't have any better suggestions, other than that this should be described somehow via ACPI (and not in vendor-specific stuff like PNP0D80).
Bjorn
The problem is the ACPI spec doesn't say what OSPM should do when something isn't power manageable by ACPI. Can you really argue it should?
Even if we DID manage to get something added to the spec how does that help everything already on the marketplace that's broken?
If we can't fix the check to be more narrow the only options I see left:
0) Do nothing.
Document somewhere that if you're on AMD and care about wake from keyboard that you need pcie_port_pm=off. I really don't think this is a good idea. Here's why:
Completely separate from this wake from USB issue I know of another issue where some PHX BIOS versions HANG the system during resume because of root port being in D3.
It's fixed in newer PHX BIOS versions, but if you end up with an OEM system that happened to launch with one of these you're in a very bad place. I haven't mentioned it because I've not seen an OEM system with this yet.
If we "do nothing" the only way to solve those will be to grow the DMI avoid list if any of these come up.
1) Hardcode all the Intel root ports that need D3 into a quirk list.
I don't know how big this list is, but I assume it's massive and doesn't scale unless the constraints stuff works for Intel to opt in the newer things.
2) Hardcode quirks for all the affected AMD PCI root ports to avoid D3.
It's 4 of them for the 2022-2023 platforms. It will probably be another 2 more for 2024 ones, and then another 2 more for 2025 ones. Maybe by 2025 the affected root port can handle D3 and let USB wake it up? I don't know.
3) Move the quirk somewhere that AMD can maintain specifically for this case outside of drivers/pci.
I did prototype it being put into drivers/platform/x86/amd/pmc.c or drivers/platform/x86/amd/pmc_quirks.c instead as part of the suspend callbacks or LPS0 callbacks.
This works and can scale as that driver at a minimum gets new IDs added for new platforms. However it makes all this logic much more convoluted and harder for anyone to follow.
On Tue, Aug 22, 2023 at 12:42 AM Bjorn Helgaas helgaas@kernel.org wrote:
On Fri, Aug 18, 2023 at 02:39:32PM -0500, Mario Limonciello wrote:
commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") changed pci_bridge_d3_possible() so that any vendor's PCIe ports from modern machines (>=2015) are allowed to be put into D3.
Iain reports that USB devices can't be used to wake a Lenovo Z13 from suspend. This is because the PCIe root port has been put into D3 and AMD's platform can't handle USB devices waking in this case.
This behavior is only reported on Linux. Comparing the behavior on Windows and Linux, Windows doesn't put the root ports into D3.
To fix the issue without regressing existing Intel systems, limit the >=2015 check to only apply to Intel PCIe ports.
Cc: stable@vger.kernel.org Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") Reported-by: Iain Lane iain@orangesquash.org.uk Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-exter... Reviewed-by:Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com
In v14 this series has been split into 3 parts. part A: Immediate fix for AMD issue. part B: LPS0 export improvements part C: Long term solution for all vendors v13->v14:
- Reword the comment
- add tag
v12->v13:
- New patch
drivers/pci/pci.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 60230da957e0c..bfdad2eb36d13 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -3037,10 +3037,15 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge) return false;
/*
* It should be safe to put PCIe ports from 2015 or newer
* to D3.
* Allow Intel PCIe ports from 2015 onward to go into D3 to
* achieve additional energy conservation on some platforms.
*
* This is only set for Intel PCIe ports as it causes problems
* on both AMD Rembrandt and Phoenix platforms where USB keyboards
* can not be used to wake the system from suspend. */
if (dmi_get_bios_year() >= 2015)
if (bridge->vendor == PCI_VENDOR_ID_INTEL &&
dmi_get_bios_year() >= 2015) return true;
Hmm. I'm really not a fan of checks like this that aren't connected to an actual property of the platform. The Intel Vendor ID tells us nothing about what the actual problem is, which makes it really hard to maintain in the future. It's also very AMD- and Intel-centric, when this code is ostensibly arch-agnostic, so this potentially regresses ARM64, RISC-V, powerpc, etc.
That's a fair point.
Would it be better to reverse this and filter out AMD systems as they are affected by the existing check?
It's bad enough that we check for 2015. A BIOS security update to a 2014 platform will break things,
Well, not necessarily. Pre-2015 systems already worked and the check was added as "surely, everything 2015 or newer should work either". While it is true that putting PCIe Root Ports into D3hot was necessary for extra energy conservation on Intel systems, it actually has been expected to work everywhere.
even though the update has nothing to do with D3. We're stuck with that one, and it's old enough that maybe it won't bite us any more, but I hate to add more.
Well, how would you like to deal with the systems that don't work today, because they expect a different behavior?
Effectively, the current behavior for all modern systems is to allow bridge D3 if there are no indications that it shouldn't be allowed. The platforms in question assume the reverse, so what else can be done?
The list of conditions in pci_bridge_d3_possible() is a pretty good clue that we don't really know what we're doing, and all we can do is find configurations that happen to work.
Yes, because by the spec it all should work just fine. The PCI PM 1.2 specification defines the expected behavior for bridges and the PCIe specification claims to be a superset of that.
What we need to deal with here is basically non-compliant systems and so we have to catch the various forms of non-compliance.
I don't have any better suggestions, other than that this should be described somehow via ACPI (and not in vendor-specific stuff like PNP0D80).
Well, it isn't in practice.
On Tue, Aug 22, 2023 at 12:11:10PM +0200, Rafael J. Wysocki wrote:
On Tue, Aug 22, 2023 at 12:42 AM Bjorn Helgaas helgaas@kernel.org wrote:
On Fri, Aug 18, 2023 at 02:39:32PM -0500, Mario Limonciello wrote:
commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") changed pci_bridge_d3_possible() so that any vendor's PCIe ports from modern machines (>=2015) are allowed to be put into D3.
Iain reports that USB devices can't be used to wake a Lenovo Z13 from suspend. This is because the PCIe root port has been put into D3 and AMD's platform can't handle USB devices waking in this case.
This behavior is only reported on Linux. Comparing the behavior on Windows and Linux, Windows doesn't put the root ports into D3.
To fix the issue without regressing existing Intel systems, limit the >=2015 check to only apply to Intel PCIe ports.
@@ -3037,10 +3037,15 @@ bool pci_bridge_d3_possible(struct pci_dev *bridge) return false;
/*
* It should be safe to put PCIe ports from 2015 or newer
* to D3.
* Allow Intel PCIe ports from 2015 onward to go into D3 to
* achieve additional energy conservation on some platforms.
*
* This is only set for Intel PCIe ports as it causes problems
* on both AMD Rembrandt and Phoenix platforms where USB keyboards
* can not be used to wake the system from suspend. */
if (dmi_get_bios_year() >= 2015)
if (bridge->vendor == PCI_VENDOR_ID_INTEL &&
dmi_get_bios_year() >= 2015) return true;
Hmm. I'm really not a fan of checks like this that aren't connected to an actual property of the platform. The Intel Vendor ID tells us nothing about what the actual problem is, which makes it really hard to maintain in the future. It's also very AMD- and Intel-centric, when this code is ostensibly arch-agnostic, so this potentially regresses ARM64, RISC-V, powerpc, etc.
That's a fair point.
Would it be better to reverse this and filter out AMD systems as they are affected by the existing check?
Since we're trying to avoid an issue on AMD systems, I would definitely prefer to have the code change mention AMD instead of Intel.
It's bad enough that we check for 2015. A BIOS security update to a 2014 platform will break things,
Well, not necessarily. Pre-2015 systems already worked and the check was added as "surely, everything 2015 or newer should work either". While it is true that putting PCIe Root Ports into D3hot was necessary for extra energy conservation on Intel systems, it actually has been expected to work everywhere.
This is a tangent; I was just trying to make the point that the date check means a BIOS update may change Linux behavior even if the update has nothing to do with PM, and I think that's a bad thing even if the new behavior is not a failure. But this is water under the bridge and is probably not going to cause problems in the future.
even though the update has nothing to do with D3. We're stuck with that one, and it's old enough that maybe it won't bite us any more, but I hate to add more.
Well, how would you like to deal with the systems that don't work today, because they expect a different behavior?
Effectively, the current behavior for all modern systems is to allow bridge D3 if there are no indications that it shouldn't be allowed. The platforms in question assume the reverse, so what else can be done?
The list of conditions in pci_bridge_d3_possible() is a pretty good clue that we don't really know what we're doing, and all we can do is find configurations that happen to work.
Yes, because by the spec it all should work just fine. The PCI PM 1.2 specification defines the expected behavior for bridges and the PCIe specification claims to be a superset of that.
What we need to deal with here is basically non-compliant systems and so we have to catch the various forms of non-compliance.
Thanks for this, that helps. If pci_bridge_d3_possible() is a list of quirks for systems that are known to be broken (or at least not known to work correctly and avoiding D3 is acceptable), then we should document and use it that way.
The current documentation ("checks if it is possible to move to D3") frames it as "does the bridge have the required features?" instead of "do we know about something broken in this bridge or this platform?"
If something is broken, I would expect tests based on the device or DMI check. But several some are not obvious defects. E.g., "bridge->is_hotplug_bridge && !pciehp_is_native(bridge)" -- what defect are we finding there? What does the spec require that isn't happening?
In this particular patch, apparently we assume any non-Intel port or any BIOS before 2015 is broken. Obviously way too general. We know "USB keyboards don't wake from suspend," but I think we need something at the PCI level like "PME interrupt doesn't happen when bridge is in state X" (i.e., the part that is non-compliant), and "one consequence is that downstream devices can't wake from suspend."
I don't have any better suggestions, other than that this should be described somehow via ACPI (and not in vendor-specific stuff like PNP0D80).
Well, it isn't in practice.
If this is basically quirks and we treat it that way (comments about the breakage and references to what the spec violations are), maybe this is the best we can do.
Bjorn
On Tue, Aug 22, 2023 at 07:02:43PM -0500, Bjorn Helgaas wrote:
On Tue, Aug 22, 2023 at 12:11:10PM +0200, Rafael J. Wysocki wrote:
What we need to deal with here is basically non-compliant systems and so we have to catch the various forms of non-compliance.
Thanks for this, that helps. If pci_bridge_d3_possible() is a list of quirks for systems that are known to be broken (or at least not known to work correctly and avoiding D3 is acceptable), then we should document and use it that way.
The current documentation ("checks if it is possible to move to D3") frames it as "does the bridge have the required features?" instead of "do we know about something broken in this bridge or this platform?"
If something is broken, I would expect tests based on the device or DMI check. But several some are not obvious defects. E.g., "bridge->is_hotplug_bridge && !pciehp_is_native(bridge)" -- what defect are we finding there? What does the spec require that isn't happening?
This particular check doesn't pertain to a defect, but indeed follows from the spec:
If hotplug control wasn't granted to the OS, the OS shall not put the hotplug port in D3 behind firmware's back because the power state affects accessibility of devices downstream of the hotplug port.
Put another way, the firmware expects to have control of hotplug and hotplug may break if the OS fiddles with the power state of the hotplug port.
Here's a bugzilla where this caused issues: https://bugzilla.kernel.org/show_bug.cgi?id=53811
On the other hand Thunderbolt hotplug ports are required to runtime suspend to D3 in order to save power. On Macs they're always handled natively by the OS. Hence the code comment.
A somewhat longer explanation I gave in 2016: https://lore.kernel.org/all/20160617213209.GA1927@wunner.de/
Perhaps the code comment preceding that check can be rephrased to convey its meaning more clearly...
Thanks,
Lukas
On Wed, Aug 23, 2023 at 07:04:53AM +0200, Lukas Wunner wrote:
On Tue, Aug 22, 2023 at 07:02:43PM -0500, Bjorn Helgaas wrote:
On Tue, Aug 22, 2023 at 12:11:10PM +0200, Rafael J. Wysocki wrote:
What we need to deal with here is basically non-compliant systems and so we have to catch the various forms of non-compliance.
Thanks for this, that helps. If pci_bridge_d3_possible() is a list of quirks for systems that are known to be broken (or at least not known to work correctly and avoiding D3 is acceptable), then we should document and use it that way.
The current documentation ("checks if it is possible to move to D3") frames it as "does the bridge have the required features?" instead of "do we know about something broken in this bridge or this platform?"
If something is broken, I would expect tests based on the device or DMI check. But several some are not obvious defects. E.g., "bridge->is_hotplug_bridge && !pciehp_is_native(bridge)" -- what defect are we finding there? What does the spec require that isn't happening?
This particular check doesn't pertain to a defect, but indeed follows from the spec:
If hotplug control wasn't granted to the OS, the OS shall not put the hotplug port in D3 behind firmware's back because the power state affects accessibility of devices downstream of the hotplug port.
Put another way, the firmware expects to have control of hotplug and hotplug may break if the OS fiddles with the power state of the hotplug port.
Here's a bugzilla where this caused issues: https://bugzilla.kernel.org/show_bug.cgi?id=53811
On the other hand Thunderbolt hotplug ports are required to runtime suspend to D3 in order to save power.
Sounds like there may be a requirement in a Thunderbolt spec about this, so maybe we could add that citation? I guess this goes with the "bridge->is_thunderbolt" check?
On Macs they're always handled natively by the OS. Hence the code comment.
And I guess this goes with the "System Management Mode" and "Thunderbolt on non-Macs" comments? A citation to the source behind "OS shall not put the hotplug port in D3 behind firmware's back" would be super helpful here.
A somewhat longer explanation I gave in 2016: https://lore.kernel.org/all/20160617213209.GA1927@wunner.de/
Perhaps the code comment preceding that check can be rephrased to convey its meaning more clearly...
Thanks! I think it would be worth trying to separate out the "normal" things that correspond to the spec from the "quirk" things that work around defects. That's not material for *this* patch, though.
It's also a little weird that pci_bridge_d3_possible() itself looks like it's invariant for the life of the system, but we call it several times (pci_pm_init(), pci_bridge_d3_update(), pcie_portdrv_probe(), etc). I guess this is because we save the result in dev->bridge_d3, but then pci_bridge_d3_update() updates dev->bridge_d3 based on other things, so the original value is lost. Maybe another bit or two could avoid those extra calls.
Bjorn
On Wed, Aug 23, 2023 at 06:46:19AM -0500, Bjorn Helgaas wrote:
On Wed, Aug 23, 2023 at 07:04:53AM +0200, Lukas Wunner wrote:
On Tue, Aug 22, 2023 at 07:02:43PM -0500, Bjorn Helgaas wrote:
On Tue, Aug 22, 2023 at 12:11:10PM +0200, Rafael J. Wysocki wrote:
What we need to deal with here is basically non-compliant systems and so we have to catch the various forms of non-compliance.
Thanks for this, that helps. If pci_bridge_d3_possible() is a list of quirks for systems that are known to be broken (or at least not known to work correctly and avoiding D3 is acceptable), then we should document and use it that way.
The current documentation ("checks if it is possible to move to D3") frames it as "does the bridge have the required features?" instead of "do we know about something broken in this bridge or this platform?"
If something is broken, I would expect tests based on the device or DMI check. But several some are not obvious defects. E.g., "bridge->is_hotplug_bridge && !pciehp_is_native(bridge)" -- what defect are we finding there? What does the spec require that isn't happening?
This particular check doesn't pertain to a defect, but indeed follows from the spec:
If hotplug control wasn't granted to the OS, the OS shall not put the hotplug port in D3 behind firmware's back because the power state affects accessibility of devices downstream of the hotplug port.
Put another way, the firmware expects to have control of hotplug and hotplug may break if the OS fiddles with the power state of the hotplug port.
Here's a bugzilla where this caused issues: https://bugzilla.kernel.org/show_bug.cgi?id=53811
On the other hand Thunderbolt hotplug ports are required to runtime suspend to D3 in order to save power.
Sounds like there may be a requirement in a Thunderbolt spec about this, so maybe we could add that citation? I guess this goes with the "bridge->is_thunderbolt" check?
Right, that's the check I was referring to. But I'm afraid there is no explicit rule in Thunderbolt / USB4 specs that hotplug ports must be runtime suspended in order to save power, at least to the best of my knowledge.
In practice, Thunderbolt controllers come in one of two forms, discrete or integrated into the CPU. Originally only discrete controllers existed, but over time it became more and more common to integrate them.
Apple was pretty much the only vendor which sold larger quantities of Thunderbolt 1 and 2 chips in the 2010 to 2016 era. Back in the day, only discrete controllers existed and they consumed around 1.5 to 2 W. On laptops, that's a significant amount of energy, so from day 1, Apple put load switches on their motherboards which allowed the Thunderbolt controllers to be powered down if nothing is plugged in.
With the *integrated* Thunderbolt controllers, powering down on idle is usually likewise required to allow the entire CPU package to enter a low power state.
To the operating system, a Thunderbolt controller is visible as a PCIe switch with an NHI device below one of the Downstream Ports and hotplugged devices appearing below the other Downstream Ports. The NHI is a vendor-agnostic Native Host Interface for tunnel setup, similar to the OHCI, EHCI, XHCI interface definitions that are in use with USB and FireWire controllers.
Linux uses a hierarchical power management model, i.e. parent devices cannot runtime suspend unless their children runtime suspend. Thus, the hotplug ports need to runtime suspend and then the Switch Upstream Port can runtime suspend and that triggers powerdown of the controller.
To cut a long story short, in *practice* Thunderbolt hotplug ports need to be put into D3hot in order to save power, so that's what we do. And we know from experience that they're all *safe* to be put into D3hot. Hence we're whitelisting them in pci_bridge_d3_possible().
By contrast, I recall that we got MCEs on Xeon-SP processors back in the day when their Root Ports were put into D3hot. Hence the rather conservative approach taken in pci_bridge_d3_possible() to whitelist only known-good, newer hardware.
On Macs they're always handled natively by the OS. Hence the code comment.
And I guess this goes with the "System Management Mode" and "Thunderbolt on non-Macs" comments? A citation to the source behind "OS shall not put the hotplug port in D3 behind firmware's back" would be super helpful here.
So I've just looked through the PCI Firmware Spec and can't find that mentioned anywhere explicitly, but it's pretty obvious if you think about it:
If the OS puts the hotplug port in D3hot, its downstream bus transitions to either B2 or B3 (PCI Power Management Spec r1.2 sec 4.7.1).
Which means devices downstream of the hotplug port become inaccessible. If hotplug control wasn't granted to the OS, firmware expects it may handle the hotplug port without interference by the OS.
If the OS fiddles with the hotplug port's power state, that expectation is no longer met. Let's say the firmware reads the downstream device's Vendor ID register to probe whether the device is there. That'll no longer work as expected once the hotplug port is transitioned to D3hot by the operating system.
In retrospect, this code comment is probably confusing:
/* * Hotplug ports handled by firmware in System Management Mode * may not be put into D3 by the OS (Thunderbolt on non-Macs). */ if (bridge->is_hotplug_bridge && !pciehp_is_native(bridge)) return false;
This is not in any way specific to Thunderbolt. It's just that back in the day, Thunderbolt tunnel management was done natively on Macs, whereas non-Macs did it in firmware. That has since changed and most vendors have adopted native tunnel management. So the code comment is outdated. The following would probably be more accurate today:
/* * Hotplug ports handled by firmware in System Management Mode * may not be put into D3 by the OS (behind firmware's back). */
A somewhat longer explanation I gave in 2016: https://lore.kernel.org/all/20160617213209.GA1927@wunner.de/
Perhaps the code comment preceding that check can be rephrased to convey its meaning more clearly...
Thanks! I think it would be worth trying to separate out the "normal" things that correspond to the spec from the "quirk" things that work around defects. That's not material for *this* patch, though.
It's also a little weird that pci_bridge_d3_possible() itself looks like it's invariant for the life of the system, but we call it several times (pci_pm_init(), pci_bridge_d3_update(), pcie_portdrv_probe(), etc). I guess this is because we save the result in dev->bridge_d3, but then pci_bridge_d3_update() updates dev->bridge_d3 based on other things, so the original value is lost. Maybe another bit or two could avoid those extra calls.
Right on all accounts. Those invocations of pci_bridge_d3_possible() are all in code paths which run only once, e.g. on enumeration and removal of the PCIe port and on shutdown. We figured that it's not worth it to cache the return value of pci_bridge_d3_possible() for these few invocations, none of which are in hot paths. ("We" is mostly Mika and yours truly, who introduced this for Thunderbolt power management.)
Thanks,
Lukas
linux-stable-mirror@lists.linaro.org