Hi all,
On 4/17/24 10:40 PM, Bjorn Helgaas wrote:
From: Bjorn Helgaas bhelgaas@google.com
Arul, Mateusz, Imcarneiro91, and Aman reported a regression caused by 07eab0901ede ("efi/x86: Remove EfiMemoryMappedIO from E820 map"). On the Lenovo Legion 9i laptop, that commit removes the area containing ECAM from E820, which means the early E820 validation started failing, which meant we didn't enable ECAM in the "early MCFG" path
The lack of ECAM caused many ACPI methods to fail, resulting in the embedded controller, PS/2, audio, trackpad, and battery devices not being detected. The _OSC method also failed, so Linux could not take control of the PCIe hotplug, PME, and AER features:
# pci_mmcfg_early_init()
PCI: ECAM [mem 0xc0000000-0xce0fffff] (base 0xc0000000) for domain 0000 [bus 00-e0] PCI: not using ECAM ([mem 0xc0000000-0xce0fffff] not reserved)
ACPI Error: AE_ERROR, Returned by Handler for [PCI_Config] (20230628/evregion-300) ACPI: Interpreter enabled ACPI: Ignoring error and continuing table load ACPI BIOS Error (bug): Could not resolve symbol [_SB.PC00.RP01._SB.PC00], AE_NOT_FOUND (20230628/dswload2-162) ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20230628/psobject-220) ACPI: Skipping parse of AML opcode: OpcodeName unavailable (0x0010) ACPI BIOS Error (bug): Could not resolve symbol [_SB.PC00.RP01._SB.PC00], AE_NOT_FOUND (20230628/dswload2-162) ACPI Error: AE_NOT_FOUND, During name lookup/catalog (20230628/psobject-220) ... ACPI Error: Aborting method _SB.PC00._OSC due to previous error (AE_NOT_FOUND) (20230628/psparse-529) acpi PNP0A08:00: _OSC: platform retains control of PCIe features (AE_NOT_FOUND)
# pci_mmcfg_late_init()
PCI: ECAM [mem 0xc0000000-0xce0fffff] (base 0xc0000000) for domain 0000 [bus 00-e0] PCI: [Firmware Info]: ECAM [mem 0xc0000000-0xce0fffff] not reserved in ACPI motherboard resources PCI: ECAM [mem 0xc0000000-0xce0fffff] is EfiMemoryMappedIO; assuming valid PCI: ECAM [mem 0xc0000000-0xce0fffff] reserved to work around lack of ACPI motherboard _CRS
Per PCI Firmware r3.3, sec 4.1.2, ECAM space must be reserved by a PNP0C02 resource, but it need not be mentioned in E820, so we shouldn't look at E820 to validate the ECAM space described by MCFG.
946f2ee5c731 ("[PATCH] i386/x86-64: Check that MCFG points to an e820 reserved area") added a sanity check of E820 to work around buggy MCFG tables, but that over-aggressive validation causes failures like this one.
Keep the E820 validation check only for older BIOSes (pre-2016) so the buggy 2006-era machines don't break. Skip the early E820 check for 2016 and newer BIOSes.
I know a fix for this has been long in the making so I don't want to throw a spanner into the works, but I wonder why is the is_efi_mmio() check inside the if (!early && !acpi_disabled) {} block (before this patch) ?
is_efi_mmio() only relies on EFI memdescriptors and those are setup pretty early. Assuming that the EFI memdescriptors are indeed setup before pci_mmcfg_reserved(..., ..., early=true) gets called we could simply move the is_efi_mmio(&cfg->res) outside (below) the if (!early && !acpi_disabled) {} so that it always runs before the is_mmconf_reserved(e820__mapped_all, ...) check.
Looking at the dmesg above the is_efi_mmio() check does succeed, so this should fix the issue without needing a BIOS year check ?
Regards,
Hans
Fixes: 07eab0901ede ("efi/x86: Remove EfiMemoryMappedIO from E820 map") Reported-by: Mateusz Kaduk mateusz.kaduk@gmail.com Reported-by: Arul <...> Reported-by: Imcarneiro91 <...> Reported-by: Aman <...> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218444 Signed-off-by: Bjorn Helgaas bhelgaas@google.com Tested-by: Mateusz Kaduk mateusz.kaduk@gmail.com Cc: stable@vger.kernel.org
arch/x86/pci/mmconfig-shared.c | 35 +++++++++++++++++++++++++++------- 1 file changed, 28 insertions(+), 7 deletions(-)
diff --git a/arch/x86/pci/mmconfig-shared.c b/arch/x86/pci/mmconfig-shared.c index 0cc9520666ef..53c7afa606c3 100644 --- a/arch/x86/pci/mmconfig-shared.c +++ b/arch/x86/pci/mmconfig-shared.c @@ -518,7 +518,34 @@ static bool __ref pci_mmcfg_reserved(struct device *dev, { struct resource *conflict;
- if (!early && !acpi_disabled) {
- if (early) {
/*
* Don't try to do this check unless configuration type 1
* is available. How about type 2?
*/
/*
* 946f2ee5c731 ("Check that MCFG points to an e820
* reserved area") added this E820 check in 2006 to work
* around BIOS defects.
*
* Per PCI Firmware r3.3, sec 4.1.2, ECAM space must be
* reserved by a PNP0C02 resource, but it need not be
* mentioned in E820. Before the ACPI interpreter is
* available, we can't check for PNP0C02 resources, so
* there's no reliable way to verify the region in this
* early check. Keep it only for the old machines that
* motivated 946f2ee5c731.
*/
if (dmi_get_bios_year() < 2016 && raw_pci_ops)
return is_mmconf_reserved(e820__mapped_all, cfg, dev,
"E820 entry");
return true;
- }
- if (!acpi_disabled) { if (is_mmconf_reserved(is_acpi_reserved, cfg, dev, "ACPI motherboard resource")) return true;
@@ -554,12 +581,6 @@ static bool __ref pci_mmcfg_reserved(struct device *dev, if (pci_mmcfg_running_state) return true;
- /* Don't try to do this check unless configuration
type 1 is available. how about type 2 ?*/
- if (raw_pci_ops)
return is_mmconf_reserved(e820__mapped_all, cfg, dev,
"E820 entry");
- return false;
}