kunit_tool's --alltests option uses UML and make allyesconfig to produce a configuration which enables as many tests as possible. However, make ARCH=um allyesconfig is broken for a number of reasons.
Fix a few different UML build breakages, and disable a few config options in kunit_tool in order to get this kernel compiling again.
Note that the resulting kernel still doesn't run, but having it compile is the first step to fixing that.
David Gow (3): drm/amdgpu: Make smu7_hwmgr build on UML IB/qib: Compile under User-Mode Linux kunit: tool: Disable broken options for --alltests
Randy Dunlap (1): drm/amdgpu: Fix compilation under UML
drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 6 +++--- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 2 +- drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 2 +- drivers/infiniband/hw/qib/qib_wc_x86_64.c | 4 ++++ tools/testing/kunit/configs/broken_on_uml.config | 5 +++++ 5 files changed, 14 insertions(+), 5 deletions(-)
From: Randy Dunlap rdunlap@infradead.org
cpuinfo_x86 and its associated macros are not available under ARCH=um, even though CONFIG_X86_64 is defined.
This patch (and discussion) were originally posted here: https://lkml.org/lkml/2022/1/24/1547
This produces the following build errors: ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c:1556:9: note: in expansion of macro ‘cpu_data’ return cpu_data(first_cpu_of_numa_node).apicid; ^~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c:1560:1: error: control reaches end of non-void function [-Werror=return-type]
../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c: In function ‘kfd_fill_iolink_info_for_cpu’: ../arch/um/include/asm/processor-generic.h:103:19: error: called object is not a function or function pointer #define cpu_data (&boot_cpu_data) ~^~~~~~~~~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:1688:27: note: in expansion of macro ‘cpu_data’ struct cpuinfo_x86 *c = &cpu_data(0); ^~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:1691:7: error: dereferencing pointer to incomplete type ‘struct cpuinfo_x86’ if (c->x86_vendor == X86_VENDOR_AMD) ^~ ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:1691:23: error: ‘X86_VENDOR_AMD’ undeclared (first use in this function); did you mean ‘X86_VENDOR_ANY’? if (c->x86_vendor == X86_VENDOR_AMD) ^~~~~~~~~~~~~~ X86_VENDOR_ANY
../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c: In function ‘kfd_create_vcrat_image_cpu’: ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:1742:11: warning: unused variable ‘entries’ [-Wunused-variable] uint32_t entries = 0;
Signed-off-by: Randy Dunlap rdunlap@infradead.org Signed-off-by: David Gow davidgow@google.com --- drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 6 +++--- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c index 9624bbe8b501..b1e2d117be3d 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c @@ -1682,7 +1682,7 @@ static int kfd_fill_mem_info_for_cpu(int numa_node_id, int *avail_size, return 0; }
-#ifdef CONFIG_X86_64 +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML) static int kfd_fill_iolink_info_for_cpu(int numa_node_id, int *avail_size, uint32_t *num_entries, struct crat_subtype_iolink *sub_type_hdr) @@ -1741,7 +1741,7 @@ static int kfd_create_vcrat_image_cpu(void *pcrat_image, size_t *size) struct crat_subtype_generic *sub_type_hdr; int avail_size = *size; int numa_node_id; -#ifdef CONFIG_X86_64 +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML) uint32_t entries = 0; #endif int ret = 0; @@ -1806,7 +1806,7 @@ static int kfd_create_vcrat_image_cpu(void *pcrat_image, size_t *size) sub_type_hdr->length);
/* Fill in Subtype: IO Link */ -#ifdef CONFIG_X86_64 +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML) ret = kfd_fill_iolink_info_for_cpu(numa_node_id, &avail_size, &entries, (struct crat_subtype_iolink *)sub_type_hdr); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c index 948fbb39336e..b38fc530ffe2 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c @@ -1552,7 +1552,7 @@ static int kfd_cpumask_to_apic_id(const struct cpumask *cpumask) first_cpu_of_numa_node = cpumask_first(cpumask); if (first_cpu_of_numa_node >= nr_cpu_ids) return -1; -#ifdef CONFIG_X86_64 +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML) return cpu_data(first_cpu_of_numa_node).apicid; #else return first_cpu_of_numa_node;
Am 2022-02-18 um 02:57 schrieb David Gow:
From: Randy Dunlap rdunlap@infradead.org
cpuinfo_x86 and its associated macros are not available under ARCH=um, even though CONFIG_X86_64 is defined.
This patch (and discussion) were originally posted here: https://lkml.org/lkml/2022/1/24/1547
This produces the following build errors: ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c:1556:9: note: in expansion of macro ‘cpu_data’ return cpu_data(first_cpu_of_numa_node).apicid; ^~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c:1560:1: error: control reaches end of non-void function [-Werror=return-type]
../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c: In function ‘kfd_fill_iolink_info_for_cpu’: ../arch/um/include/asm/processor-generic.h:103:19: error: called object is not a function or function pointer #define cpu_data (&boot_cpu_data) ~^~~~~~~~~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:1688:27: note: in expansion of macro ‘cpu_data’ struct cpuinfo_x86 *c = &cpu_data(0); ^~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:1691:7: error: dereferencing pointer to incomplete type ‘struct cpuinfo_x86’ if (c->x86_vendor == X86_VENDOR_AMD) ^~ ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:1691:23: error: ‘X86_VENDOR_AMD’ undeclared (first use in this function); did you mean ‘X86_VENDOR_ANY’? if (c->x86_vendor == X86_VENDOR_AMD) ^~~~~~~~~~~~~~ X86_VENDOR_ANY
../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c: In function ‘kfd_create_vcrat_image_cpu’: ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:1742:11: warning: unused variable ‘entries’ [-Wunused-variable] uint32_t entries = 0;
Signed-off-by: Randy Dunlap rdunlap@infradead.org Signed-off-by: David Gow davidgow@google.com
drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 6 +++--- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c index 9624bbe8b501..b1e2d117be3d 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c @@ -1682,7 +1682,7 @@ static int kfd_fill_mem_info_for_cpu(int numa_node_id, int *avail_size, return 0; } -#ifdef CONFIG_X86_64 +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML)
I don't think it makes sense to compile a hardware device driver in a UML config. Instead of scattering UML #ifdefs through our code, I would recommend adding this to Kconfig:
--- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -254,7 +254,7 @@ source "drivers/gpu/drm/radeon/Kconfig"
config DRM_AMDGPU tristate "AMD GPU" - depends on DRM && PCI && MMU + depends on DRM && PCI && MMU && !UML select FW_LOADER select DRM_KMS_HELPER select DRM_SCHED
That would address patch 2 of this series as well.
Regards, Felix
static int kfd_fill_iolink_info_for_cpu(int numa_node_id, int *avail_size, uint32_t *num_entries, struct crat_subtype_iolink *sub_type_hdr) @@ -1741,7 +1741,7 @@ static int kfd_create_vcrat_image_cpu(void *pcrat_image, size_t *size) struct crat_subtype_generic *sub_type_hdr; int avail_size = *size; int numa_node_id; -#ifdef CONFIG_X86_64 +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML) uint32_t entries = 0; #endif int ret = 0; @@ -1806,7 +1806,7 @@ static int kfd_create_vcrat_image_cpu(void *pcrat_image, size_t *size) sub_type_hdr->length); /* Fill in Subtype: IO Link */ -#ifdef CONFIG_X86_64 +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML) ret = kfd_fill_iolink_info_for_cpu(numa_node_id, &avail_size, &entries, (struct crat_subtype_iolink *)sub_type_hdr); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c index 948fbb39336e..b38fc530ffe2 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c @@ -1552,7 +1552,7 @@ static int kfd_cpumask_to_apic_id(const struct cpumask *cpumask) first_cpu_of_numa_node = cpumask_first(cpumask); if (first_cpu_of_numa_node >= nr_cpu_ids) return -1; -#ifdef CONFIG_X86_64 +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML) return cpu_data(first_cpu_of_numa_node).apicid; #else return first_cpu_of_numa_node;
On Fri, Feb 18, 2022 at 11:39 AM Felix Kuehling felix.kuehling@amd.com wrote:
Am 2022-02-18 um 02:57 schrieb David Gow:
From: Randy Dunlap rdunlap@infradead.org
cpuinfo_x86 and its associated macros are not available under ARCH=um, even though CONFIG_X86_64 is defined.
This patch (and discussion) were originally posted here: https://lkml.org/lkml/2022/1/24/1547
This produces the following build errors: ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c:1556:9: note: in expansion of macro ‘cpu_data’ return cpu_data(first_cpu_of_numa_node).apicid; ^~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c:1560:1: error: control reaches end of non-void function [-Werror=return-type]
../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c: In function ‘kfd_fill_iolink_info_for_cpu’: ../arch/um/include/asm/processor-generic.h:103:19: error: called object is not a function or function pointer #define cpu_data (&boot_cpu_data) ~^~~~~~~~~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:1688:27: note: in expansion of macro ‘cpu_data’ struct cpuinfo_x86 *c = &cpu_data(0); ^~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:1691:7: error: dereferencing pointer to incomplete type ‘struct cpuinfo_x86’ if (c->x86_vendor == X86_VENDOR_AMD) ^~ ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:1691:23: error: ‘X86_VENDOR_AMD’ undeclared (first use in this function); did you mean ‘X86_VENDOR_ANY’? if (c->x86_vendor == X86_VENDOR_AMD) ^~~~~~~~~~~~~~ X86_VENDOR_ANY
../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c: In function ‘kfd_create_vcrat_image_cpu’: ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:1742:11: warning: unused variable ‘entries’ [-Wunused-variable] uint32_t entries = 0;
Signed-off-by: Randy Dunlap rdunlap@infradead.org Signed-off-by: David Gow davidgow@google.com
drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 6 +++--- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c index 9624bbe8b501..b1e2d117be3d 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c @@ -1682,7 +1682,7 @@ static int kfd_fill_mem_info_for_cpu(int numa_node_id, int *avail_size, return 0; }
-#ifdef CONFIG_X86_64 +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML)
I don't think it makes sense to compile a hardware device driver in a UML config. Instead of scattering UML #ifdefs through our code, I would recommend adding this to Kconfig:
--- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -254,7 +254,7 @@ source "drivers/gpu/drm/radeon/Kconfig"
config DRM_AMDGPU tristate "AMD GPU"
depends on DRM && PCI && MMU
depends on DRM && PCI && MMU && !UML select FW_LOADER select DRM_KMS_HELPER select DRM_SCHED
That would address patch 2 of this series as well.
I agree. Otherwise, we are always going to be chasing these issues.
Alex
Regards, Felix
static int kfd_fill_iolink_info_for_cpu(int numa_node_id, int *avail_size, uint32_t *num_entries, struct crat_subtype_iolink *sub_type_hdr) @@ -1741,7 +1741,7 @@ static int kfd_create_vcrat_image_cpu(void *pcrat_image, size_t *size) struct crat_subtype_generic *sub_type_hdr; int avail_size = *size; int numa_node_id; -#ifdef CONFIG_X86_64 +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML) uint32_t entries = 0; #endif int ret = 0; @@ -1806,7 +1806,7 @@ static int kfd_create_vcrat_image_cpu(void *pcrat_image, size_t *size) sub_type_hdr->length);
/* Fill in Subtype: IO Link */
-#ifdef CONFIG_X86_64 +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML) ret = kfd_fill_iolink_info_for_cpu(numa_node_id, &avail_size, &entries, (struct crat_subtype_iolink *)sub_type_hdr); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c index 948fbb39336e..b38fc530ffe2 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c @@ -1552,7 +1552,7 @@ static int kfd_cpumask_to_apic_id(const struct cpumask *cpumask) first_cpu_of_numa_node = cpumask_first(cpumask); if (first_cpu_of_numa_node >= nr_cpu_ids) return -1; -#ifdef CONFIG_X86_64 +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML) return cpu_data(first_cpu_of_numa_node).apicid; #else return first_cpu_of_numa_node;
On Sat, Feb 19, 2022 at 12:39 AM Felix Kuehling felix.kuehling@amd.com wrote:
Am 2022-02-18 um 02:57 schrieb David Gow:
From: Randy Dunlap rdunlap@infradead.org
cpuinfo_x86 and its associated macros are not available under ARCH=um, even though CONFIG_X86_64 is defined.
This patch (and discussion) were originally posted here: https://lkml.org/lkml/2022/1/24/1547
This produces the following build errors: ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c:1556:9: note: in expansion of macro ‘cpu_data’ return cpu_data(first_cpu_of_numa_node).apicid; ^~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_topology.c:1560:1: error: control reaches end of non-void function [-Werror=return-type]
../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c: In function ‘kfd_fill_iolink_info_for_cpu’: ../arch/um/include/asm/processor-generic.h:103:19: error: called object is not a function or function pointer #define cpu_data (&boot_cpu_data) ~^~~~~~~~~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:1688:27: note: in expansion of macro ‘cpu_data’ struct cpuinfo_x86 *c = &cpu_data(0); ^~~~~~~~ ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:1691:7: error: dereferencing pointer to incomplete type ‘struct cpuinfo_x86’ if (c->x86_vendor == X86_VENDOR_AMD) ^~ ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:1691:23: error: ‘X86_VENDOR_AMD’ undeclared (first use in this function); did you mean ‘X86_VENDOR_ANY’? if (c->x86_vendor == X86_VENDOR_AMD) ^~~~~~~~~~~~~~ X86_VENDOR_ANY
../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c: In function ‘kfd_create_vcrat_image_cpu’: ../drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_crat.c:1742:11: warning: unused variable ‘entries’ [-Wunused-variable] uint32_t entries = 0;
Signed-off-by: Randy Dunlap rdunlap@infradead.org Signed-off-by: David Gow davidgow@google.com
drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 6 +++--- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c index 9624bbe8b501..b1e2d117be3d 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_crat.c @@ -1682,7 +1682,7 @@ static int kfd_fill_mem_info_for_cpu(int numa_node_id, int *avail_size, return 0; }
-#ifdef CONFIG_X86_64 +#if defined(CONFIG_X86_64) && !defined(CONFIG_UML)
I don't think it makes sense to compile a hardware device driver in a UML config. Instead of scattering UML #ifdefs through our code, I would recommend adding this to Kconfig:
There are cases where I think it could make sense to have a hardware driver under UML, though I agree they're pretty rare.
In particular, there's the virtio PCI support in UML, which one could potentially hook up a GPU to. The case I care more about is the ability to run KUnit tests under UML: if amdgpu wanted to have KUnit tests, it could still run them in qemu (or on real hardware), but UML is faster and more convenient, if the code being tested can compile under it.
So I have a slight preference personally for fixing this, to unblock those uses.
That being said, it's definitely not worth placing a significant burden on you maintaining these things if no-one uses them. And since there doesn't appear to be any such use at the moment[1], I've no strong objection to just disabling this for now (it can always be re-enabled and fixed if it becomes useful later).
Cheers, -- David
[1] The proposed DRM KUnit tests don't require any actual hardware drivers, as I understand it: https://lore.kernel.org/all/20220117232259.180459-5-michal.winiarski@intel.c...
The User-Mode-Linux architecture (with the x86_64 subarch) defines CONFIG_X86_64, but doesn't expose the cpuinfo_x86 struct (instead there's a cpuinfo_um struct).
In order to allow UML to build with allyesconfig, only check cpuinfo_x86 on non-UML architectures.
Fixes: b3dc549986 ("mdgpu: Disable PCIE_DPM on Intel RKL Platform") Signed-off-by: David Gow davidgow@google.com --- drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c index a1e11037831a..a162552f7845 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c @@ -1738,7 +1738,7 @@ static int smu7_disable_dpm_tasks(struct pp_hwmgr *hwmgr)
static bool intel_core_rkl_chk(void) { -#if IS_ENABLED(CONFIG_X86_64) +#if IS_ENABLED(CONFIG_X86_64) && !defined(CONFIG_UML) struct cpuinfo_x86 *c = &cpu_data(0);
return (c->x86 == 6 && c->x86_model == INTEL_FAM6_ROCKETLAKE);
The User-Mode-Linux architecture (with the x86_64 subarch) defines CONFIG_X86_64, but doesn't expose the cpuinfo_x86 struct (instead there's a cpuinfo_um struct).
In order to allow UML to build with allyesconfig, only check cpuinfo_x86 on non-UML architectures.
Signed-off-by: David Gow davidgow@google.com --- drivers/infiniband/hw/qib/qib_wc_x86_64.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/infiniband/hw/qib/qib_wc_x86_64.c b/drivers/infiniband/hw/qib/qib_wc_x86_64.c index edd0ddbd4481..76fef1321c26 100644 --- a/drivers/infiniband/hw/qib/qib_wc_x86_64.c +++ b/drivers/infiniband/hw/qib/qib_wc_x86_64.c @@ -146,5 +146,9 @@ void qib_disable_wc(struct qib_devdata *dd) */ int qib_unordered_wc(void) { +#ifndef CONFIG_UML return boot_cpu_data.x86_vendor != X86_VENDOR_AMD; +#else + return 0; +#endif }
There are a number of Kconfig options which break compilation under UML with allyesconfig. As kunit_tool's --alltests option is based on allyesconfig and UML, we need to update the list of broken options to make --alltests build again.
Note that, while this does build again, it still segfaults on startup, so more work remains to be done.
They are: - CONFIG_VFIO_PCI: Needs ioport_map/ioport_unmap. - CONFIG_INFINIBAND_RDMAVT: Needs cpuinfo_x86 and __copy_user_nocache - CONFIG_BNXT: Failing under UML with -Werror ERROR:root:../drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c: In function ‘bnxt_ptp_enable’: ../drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c:400:43: error: array subscript 255 is above array bounds of ‘struct pps_pin[4]’ [-Werror=array-bounds] 400 | ptp->pps_info.pins[pin_id].event = BNXT_PPS_EVENT_EXTERNAL; | ~~~~~~~~~~~~~~~~~~^~~~~~~~ - CONFIG_PATA_CS5535: Needs MSR access (__tracepoint_{read,write}_msr) - CONFIG_VDPA: Enables CONFIG_DMA_OPS, which is unimplemented. ('dma_ops' is not defined)
These are all issues which should be investigated properly and the corresponding options either fixed or disabled under UML. Having this list of broken options should act as a good to-do list here, and will allow these issues to be worked on independently, and other tests to work in the meantime.
Signed-off-by: David Gow davidgow@google.com --- tools/testing/kunit/configs/broken_on_uml.config | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/tools/testing/kunit/configs/broken_on_uml.config b/tools/testing/kunit/configs/broken_on_uml.config index 690870043ac0..546482b0bc4d 100644 --- a/tools/testing/kunit/configs/broken_on_uml.config +++ b/tools/testing/kunit/configs/broken_on_uml.config @@ -42,3 +42,8 @@ # CONFIG_ADI_AXI_ADC is not set # CONFIG_DEBUG_PAGEALLOC is not set # CONFIG_PAGE_POISONING is not set +# CONFIG_VFIO_PCI is not set +# CONFIG_INFINIBAND_RDMAVT is not set +# CONFIG_BNXT is not set +# CONFIG_PATA_CS5535 is not set +# CONFIG_VDPA is not set
On Fri, 2022-02-18 at 15:57 +0800, David Gow wrote:
Note that, while this does build again, it still segfaults on startup, so more work remains to be done.
That's probably just a lot more stuff getting included somehow?
They are:
- CONFIG_VFIO_PCI: Needs ioport_map/ioport_unmap.
- CONFIG_INFINIBAND_RDMAVT: Needs cpuinfo_x86 and __copy_user_nocache
- CONFIG_BNXT: Failing under UML with -Werror
ERROR:root:../drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c: In function ‘bnxt_ptp_enable’: ../drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c:400:43: error: array subscript 255 is above array bounds of ‘struct pps_pin[4]’ [-Werror=array-bounds] 400 | ptp->pps_info.pins[pin_id].event = BNXT_PPS_EVENT_EXTERNAL; | ~~~~~~~~~~~~~~~~~~^~~~~~~~
- CONFIG_PATA_CS5535: Needs MSR access (__tracepoint_{read,write}_msr)
- CONFIG_VDPA: Enables CONFIG_DMA_OPS, which is unimplemented. ('dma_ops' is not defined)
These are all issues which should be investigated properly and the corresponding options either fixed or disabled under UML. Having this list of broken options should act as a good to-do list here, and will allow these issues to be worked on independently, and other tests to work in the meantime.
I'm not really sure it makes sense to even do anything other than disabling these.
It looks like all of them are just exposed by now being able to build PCI drivers on UML. Surely the people writing the driver didn't expect their drivers to run over simulated PCI (which is what the UML PCI support is all about).
Now from a PCI driver point of view you can't really tell the difference (and anyway the driver won't be probed), but the issues (at least the build time ones) come from having
UML && PCI && X86_64
or
UML && PCI && X86_32
because drivers typically depend on X86_64 or X86_32, rather than on "X86 && X86_64" or "X86 && X86_32". In a sense thus, the issue is those drivers don't know that "!X86 && (X86_32 || X86_64)" can happen (with UML).
Now you could say that's the driver bug, or you could say that they should just add "depends on !UML" (though that's basically equivalent to adding "depends on X86" and the latter may be preferable in some cases).
Or actually in the three patches you have (1-3) it's in the code, but same thing, you can either add && !UML (like you did) or add && X86.
Arguably, however, building PCI drivers by default is somewhat questionable in the first place?
So maybe you should just add
# CONFIG_UML_PCI_OVER_VIRTIO is not set
to the broken_on_uml.config since it exposes all these issues, and really is not very useful since you're not going to actually run with any simulated PCI devices anyway, so drivers will not be probed.
johannes
On Fri, Feb 18, 2022 at 8:26 PM Johannes Berg johannes@sipsolutions.net wrote:
On Fri, 2022-02-18 at 15:57 +0800, David Gow wrote:
Note that, while this does build again, it still segfaults on startup, so more work remains to be done.
That's probably just a lot more stuff getting included somehow?
Yeah: it used to work (a couple of years ago), but something has broken it in the meantime. It's just a shame that bisecting things with allyesconfig takes so long...
They are:
- CONFIG_VFIO_PCI: Needs ioport_map/ioport_unmap.
- CONFIG_INFINIBAND_RDMAVT: Needs cpuinfo_x86 and __copy_user_nocache
- CONFIG_BNXT: Failing under UML with -Werror
ERROR:root:../drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c: In function ‘bnxt_ptp_enable’: ../drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c:400:43: error: array subscript 255 is above array bounds of ‘struct pps_pin[4]’ [-Werror=array-bounds] 400 | ptp->pps_info.pins[pin_id].event = BNXT_PPS_EVENT_EXTERNAL; | ~~~~~~~~~~~~~~~~~~^~~~~~~~
- CONFIG_PATA_CS5535: Needs MSR access (__tracepoint_{read,write}_msr)
- CONFIG_VDPA: Enables CONFIG_DMA_OPS, which is unimplemented. ('dma_ops' is not defined)
These are all issues which should be investigated properly and the corresponding options either fixed or disabled under UML. Having this list of broken options should act as a good to-do list here, and will allow these issues to be worked on independently, and other tests to work in the meantime.
I'm not really sure it makes sense to even do anything other than disabling these.
It looks like all of them are just exposed by now being able to build PCI drivers on UML. Surely the people writing the driver didn't expect their drivers to run over simulated PCI (which is what the UML PCI support is all about).
Now from a PCI driver point of view you can't really tell the difference (and anyway the driver won't be probed), but the issues (at least the build time ones) come from having
UML && PCI && X86_64
or
UML && PCI && X86_32
because drivers typically depend on X86_64 or X86_32, rather than on "X86 && X86_64" or "X86 && X86_32". In a sense thus, the issue is those drivers don't know that "!X86 && (X86_32 || X86_64)" can happen (with UML).
Now you could say that's the driver bug, or you could say that they should just add "depends on !UML" (though that's basically equivalent to adding "depends on X86" and the latter may be preferable in some cases).
Or actually in the three patches you have (1-3) it's in the code, but same thing, you can either add && !UML (like you did) or add && X86.
I didn't realise X86 wasn't defined in UML: that's definitely a bit cleaner than !UML in a number of these cases.
Not all of those issues are fundamentally solved by "depends on X86", though: there are a few which might be other missing things in UML (maybe the 'dma_ops' issues?), and/or might be the result of -Werror being enabled.
Arguably, however, building PCI drivers by default is somewhat questionable in the first place?
We do want the ability to build PCI drivers under UML, as it makes running KUnit tests for PCI drivers much simpler and more pleasant. And indeed, it does work for KUnit in general, it's just that some drivers do have the issues mentioned above, so allyesconfig picks up every broken driver.
We don't actually build the PCI drivers by default, only if the "--alltests" option is passed, which does include them, as we do have tests which depend on PCI we'd like to run (like the thunderbolt test).
So maybe you should just add
# CONFIG_UML_PCI_OVER_VIRTIO is not set
to the broken_on_uml.config since it exposes all these issues, and really is not very useful since you're not going to actually run with any simulated PCI devices anyway, so drivers will not be probed.
I did try this as well, and it just got us a different set of issues (there are a bunch of drivers which depend on IOMEM but don't state it -- I'll try to send fixes for those out next week). And we'd lose things like the thunderbolt test if PCI
Ultimately, the 'broken_on_uml.config' file is just there to pare back allyesconfig a bit for KUnit's purposes, but we still definitely want as many options (and hence tests) enabled as possible long-term. So I think actual fixes to either the code or Kconfig do make sense.
Is 'make ARCH=um allyesconfig' something we actually want to be able to build? If so, no amount of adding things to KUnit's broken_on_uml.config will solve the underlying issues, and we'll need to at least update the Kconfig entries.
Cheers, -- David
On Sat, 2022-02-19 at 16:00 +0800, David Gow wrote:
On Fri, Feb 18, 2022 at 8:26 PM Johannes Berg johannes@sipsolutions.net wrote:
On Fri, 2022-02-18 at 15:57 +0800, David Gow wrote:
Note that, while this does build again, it still segfaults on startup, so more work remains to be done.
That's probably just a lot more stuff getting included somehow?
Yeah: it used to work (a couple of years ago), but something has broken it in the meantime. It's just a shame that bisecting things with allyesconfig takes so long...
Heh, right.
But I guess you could "Kconfig bisect" first, i.e. see what option breaks it? It might not even help to bisect, if it's just some option getting enabled over time. Or perhaps the kernel is just too big for the address space layout if you have allyesconfig? Though that shouldn't be an issue, I think.
I didn't realise X86 wasn't defined in UML:
X86 is the architecture, X86_{32,64} is kind of a selection for how you want things to be built, and it's thus required for UML on x86, because UML imports stuff from X86.
that's definitely a bit cleaner than !UML in a number of these cases.
It looks like some (most?) of them don't really work that way though since they're not really platform specific, they just know only about a handful of platforms that they're compatible with.
Not all of those issues are fundamentally solved by "depends on X86", though: there are a few which might be other missing things in UML (maybe the 'dma_ops' issues?), and/or might be the result of -Werror being enabled.
Right.
We do want the ability to build PCI drivers under UML, as it makes running KUnit tests for PCI drivers much simpler and more pleasant.
OK, fair point. I'm thinking about this area in general also right now for iwlwifi, and obviously we're probably the only user of the virtual PCI code that lets us connect the driver to a simulated device on UML (but the driver doesn't really know) :-)
And indeed, it does work for KUnit in general, it's just that some drivers do have the issues mentioned above, so allyesconfig picks up every broken driver.
Right.
We don't actually build the PCI drivers by default, only if the "--alltests" option is passed, which does include them, as we do have tests which depend on PCI we'd like to run (like the thunderbolt test).
Makes sense.
I did try this as well, and it just got us a different set of issues (there are a bunch of drivers which depend on IOMEM but don't state it -- I'll try to send fixes for those out next week).
Fun.
Ultimately, the 'broken_on_uml.config' file is just there to pare back allyesconfig a bit for KUnit's purposes, but we still definitely want as many options (and hence tests) enabled as possible long-term. So I think actual fixes to either the code or Kconfig do make sense.
Makes sense.
Is 'make ARCH=um allyesconfig' something we actually want to be able to build? If so, no amount of adding things to KUnit's broken_on_uml.config will solve the underlying issues, and we'll need to at least update the Kconfig entries.
That's a good point, as long as people are doing allyes/randconfig builds on UML, we probably need to have these fixes anyway rather than disabling something for KUnit specifically.
johannes
On Fri, Feb 18, 2022 at 03:57:27PM +0800, David Gow wrote:
There are a number of Kconfig options which break compilation under UML with allyesconfig. As kunit_tool's --alltests option is based on allyesconfig and UML, we need to update the list of broken options to make --alltests build again.
Note that, while this does build again, it still segfaults on startup, so more work remains to be done.
They are:
- CONFIG_VFIO_PCI: Needs ioport_map/ioport_unmap.
- CONFIG_INFINIBAND_RDMAVT: Needs cpuinfo_x86 and __copy_user_nocache
It doesn't make sense to patch qib and then turn this option off, it is required to build qib.
Jason
linux-kselftest-mirror@lists.linaro.org