Alejandro Lucero Palau wrote:
> Hi Dan,
> I think this is the same issue one of the patches in type2 support tries to deal with:
> https://lore.kernel.org/linux-cxl/20240907081836.5801-1-alejandro.lucero-pal...
> If this fixes that situation, I guess I can drop that one from v4, which is ready to be sent.
> The fix for the other problem in that patch, the endpoint not being there when that code tries to use it, is likely not needed either, although I now have a trivial fix for it instead of that ugly loop with delays. The solution is to set PROBE_FORCE_SYNCHRONOUS as the probe_type for cxl_mem_driver, which implies that device_add will only return once the device is really created. Maybe that is worth it for other potential situations suffering from delayed creation.
I am skeptical that PROBE_FORCE_SYNCHRONOUS is a fix for any device-readiness bug. Some other assumption is violated if that is required.
For the type-2 case I did have an EPROBE_DEFER in my initial RFC, on the assumption that an accelerator driver might want to wait until CXL is initialized before the base accelerator proceeds. However, if accelerator drivers behave the same as the cxl_pci driver and are OK with asynchronous arrival of CXL functionality, then no deferral is needed.
Otherwise, the only motivation for synchronous probing I can think of would be to have more predictable naming of kernel objects. So yes, I would be curious to understand in what scenarios probe deferral is still needed.
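
To make the two accelerator probe strategies above concrete, here is a rough userspace model, not kernel code. struct accel_model and the accel_* functions are hypothetical names invented for this sketch; only the EPROBE_DEFER value is taken from the kernel's include/linux/errno.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Value of EPROBE_DEFER in the kernel's include/linux/errno.h.
 * It is kernel-internal and not a userspace errno. */
#define EPROBE_DEFER 517

/* Hypothetical stand-in for accelerator state; not a kernel structure. */
struct accel_model {
	bool cxl_ready;    /* has the CXL side finished initializing? */
	bool base_running; /* is the base accelerator function up? */
	bool cxl_attached; /* has the driver picked up CXL functionality? */
};

/* Strategy A: refuse to start until CXL is ready, so the caller
 * has to retry the probe later. */
static int accel_probe_deferring(struct accel_model *a)
{
	if (!a->cxl_ready)
		return -EPROBE_DEFER;
	a->base_running = true;
	a->cxl_attached = true;
	return 0;
}

/* Strategy B: start the base device immediately and attach CXL
 * only if it happens to be ready already. */
static int accel_probe_async_tolerant(struct accel_model *a)
{
	a->base_running = true;
	a->cxl_attached = a->cxl_ready;
	return 0;
}

/* Hypothetical notification that CXL functionality arrived after probe. */
static void accel_cxl_arrived(struct accel_model *a)
{
	a->cxl_ready = true;
	if (a->base_running)
		a->cxl_attached = true;
}
```

In this model, strategy B never needs a retry: the base device is usable right away and CXL is attached whenever accel_cxl_arrived() runs, which is the behavior that would make a deferral unnecessary.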