On Fri, Feb 18, 2022 at 8:26 PM Johannes Berg johannes@sipsolutions.net wrote:
On Fri, 2022-02-18 at 15:57 +0800, David Gow wrote:
Note that, while this does build again, it still segfaults on startup, so more work remains to be done.
That's probably just a lot more stuff getting included somehow?
Yeah: it used to work (a couple of years ago), but something has broken it in the meantime. It's just a shame that bisecting things with allyesconfig takes so long...
They are:
- CONFIG_VFIO_PCI: Needs ioport_map/ioport_unmap.
- CONFIG_INFINIBAND_RDMAVT: Needs cpuinfo_x86 and __copy_user_nocache
- CONFIG_BNXT: Failing under UML with -Werror
ERROR:root:../drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c: In function ‘bnxt_ptp_enable’: ../drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c:400:43: error: array subscript 255 is above array bounds of ‘struct pps_pin[4]’ [-Werror=array-bounds] 400 | ptp->pps_info.pins[pin_id].event = BNXT_PPS_EVENT_EXTERNAL; | ~~~~~~~~~~~~~~~~~~^~~~~~~~
- CONFIG_PATA_CS5535: Needs MSR access (__tracepoint_{read,write}_msr)
- CONFIG_VDPA: Enables CONFIG_DMA_OPS, which is unimplemented. ('dma_ops' is not defined)
These are all issues which should be investigated properly and the corresponding options either fixed or disabled under UML. Having this list of broken options should act as a good to-do list here, and will allow these issues to be worked on independently, and other tests to work in the meantime.
I'm not really sure it makes sense to even do anything other than disabling these.
It looks like all of them are just exposed by now being able to build PCI drivers on UML. Surely the people writing the driver didn't expect their drivers to run over simulated PCI (which is what the UML PCI support is all about).
Now from a PCI driver point of view you can't really tell the difference (and anyway the driver won't be probed), but the issues (at least the build time ones) come from having
UML && PCI && X86_64
or
UML && PCI && X86_32
because drivers typically depend on X86_64 or X86_32, rather than on "X86 && X86_64" or "X86 && X86_32". In a sense thus, the issue is those drivers don't know that "!X86 && (X86_32 || X86_64)" can happen (with UML).
Now you could say that's the driver bug, or you could say that they should just add "depends on !UML" (though that's basically equivalent to adding "depends on X86" and the latter may be preferable in some cases).
Or actually in the three patches you have (1-3) it's in the code, but same thing, you can either add && !UML (like you did) or add && X86.
I didn't realise X86 wasn't defined in UML: that's definitely a bit cleaner than !UML in a number of these cases.
Not all of those issues are fundamentally solved by "depends on X86", though: there are a few which might be other missing things in UML (maybe the 'dma_ops' issues?), and/or might be the result of -Werror being enabled.
Arguably, however, building PCI drivers by default is somewhat questionable in the first place?
We do want the ability to build PCI drivers under UML, as it makes running KUnit tests for PCI drivers much simpler and more pleasant. And indeed, it does work for KUnit in general, it's just that some drivers do have the issues mentioned above, so allyesconfig picks up every broken driver.
We don't actually build the PCI drivers by default, only if the "--alltests" option is passed, which does include them, as we do have tests which depend on PCI we'd like to run (like the thunderbolt test).
So maybe you should just add
# CONFIG_UML_PCI_OVER_VIRTIO is not set
to the broken_on_uml.config since it exposes all these issues, and really is not very useful since you're not going to actually run with any simulated PCI devices anyway, so drivers will not be probed.
I did try this as well, and it just got us a different set of issues (there are a bunch of drivers which depend on IOMEM but don't state it -- I'll try to send fixes for those out next week). And we'd lose things like the thunderbolt test if PCI
Ultimately, the 'broken_on_uml.config' file is just there to pare back allyesconfig a bit for KUnit's purposes, but we still definitely want as many options (and hence tests) enabled as possible long-term. So I think actual fixes to either the code or Kconfig do make sense.
Is 'make ARCH=um allyesconfig' something we actually want to be able to build? If so, no amount of adding things to KUnit's broken_on_uml.config will solve the underlying issues, and we'll need to at least update the Kconfig entries.
Cheers, -- David