On Tue, 22 Oct 2024 18:43:24 -0700 Dan Williams dan.j.williams@intel.com wrote:
When the CXL subsystem is built-in the module init order is determined by Makefile order. That order violates expectations. The expectation is that cxl_acpi and cxl_mem can race to attach and that if cxl_acpi wins the race cxl_mem will find the enabled CXL root ports it needs and if cxl_acpi loses the race it will retrigger cxl_mem to attach via cxl_bus_rescan(). That only works if cxl_acpi can assume ports are enabled immediately upon cxl_acpi_probe() return. That in turn can only happen in the CONFIG_CXL_ACPI=y case if the cxl_port object appears before the cxl_acpi object in the Makefile.
Fix up the order to prevent initialization failures, and make sure that cxl_port is built-in if cxl_acpi is also built-in.
As for what contributed to this not being found earlier, the CXL regression environment, cxl_test, builds all CXL functionality as a module to allow to symbol mocking and other dynamic reload tests. As a result there is no regression coverage for the built-in case.
Reported-by: Gregory Price gourry@gourry.net Closes: http://lore.kernel.org/20241004212504.1246-1-gourry@gourry.net Tested-by: Gregory Price gourry@gourry.net Fixes: 8dd2bc0f8e02 ("cxl/mem: Add the cxl_mem driver")
I don't like this due to likely long term fragility, but any other solution is probably more painful. Long term we should really get a regression test for these ordering issues in place in one of the CIs.
Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com
Cc: stable@vger.kernel.org Cc: Davidlohr Bueso dave@stgolabs.net Cc: Jonathan Cameron jonathan.cameron@huawei.com Cc: Dave Jiang dave.jiang@intel.com Cc: Alison Schofield alison.schofield@intel.com Cc: Vishal Verma vishal.l.verma@intel.com Cc: Ira Weiny ira.weiny@intel.com Signed-off-by: Dan Williams dan.j.williams@intel.com
drivers/cxl/Kconfig | 1 + drivers/cxl/Makefile | 20 ++++++++++++++------ 2 files changed, 15 insertions(+), 6 deletions(-)
diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig index 29c192f20082..876469e23f7a 100644 --- a/drivers/cxl/Kconfig +++ b/drivers/cxl/Kconfig @@ -60,6 +60,7 @@ config CXL_ACPI default CXL_BUS select ACPI_TABLE_LIB select ACPI_HMAT
- select CXL_PORT help Enable support for host managed device memory (HDM) resources published by a platform's ACPI CXL memory layout description. See
diff --git a/drivers/cxl/Makefile b/drivers/cxl/Makefile index db321f48ba52..2caa90fa4bf2 100644 --- a/drivers/cxl/Makefile +++ b/drivers/cxl/Makefile @@ -1,13 +1,21 @@ # SPDX-License-Identifier: GPL-2.0
+# Order is important here for the built-in case: +# - 'core' first for fundamental init +# - 'port' before platform root drivers like 'acpi' so that CXL-root ports +# are immediately enabled +# - 'mem' and 'pmem' before endpoint drivers so that memdevs are +# immediately enabled +# - 'pci' last, also mirrors the hardware enumeration hierarchy obj-y += core/ -obj-$(CONFIG_CXL_PCI) += cxl_pci.o -obj-$(CONFIG_CXL_MEM) += cxl_mem.o +obj-$(CONFIG_CXL_PORT) += cxl_port.o obj-$(CONFIG_CXL_ACPI) += cxl_acpi.o obj-$(CONFIG_CXL_PMEM) += cxl_pmem.o -obj-$(CONFIG_CXL_PORT) += cxl_port.o +obj-$(CONFIG_CXL_MEM) += cxl_mem.o +obj-$(CONFIG_CXL_PCI) += cxl_pci.o -cxl_mem-y := mem.o -cxl_pci-y := pci.o +cxl_port-y := port.o cxl_acpi-y := acpi.o cxl_pmem-y := pmem.o security.o -cxl_port-y := port.o +cxl_mem-y := mem.o +cxl_pci-y := pci.o