On Thu, Nov 14, 2024 at 12:20:10PM -0400, Jason Gunthorpe wrote:
On Wed, Nov 13, 2024 at 07:18:42PM -0800, Nicolin Chen wrote:
So the user would try to create vDevices with a given viommu_obj until one fails, then allocate another viommu_obj for the failed device, is that right? Sounds reasonable.
Yes. It is the same as how we previously dealt with a nesting parent: test, and allocate if it fails. The virtual IOMMU driver in the VMM can keep a list of the vIOMMU objects for each device to test against.
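A minimal sketch of that test-and-allocate flow, for illustration only. The names `attach_device`, `try_create_vdevice` and `alloc_viommu` are hypothetical stand-ins for whatever the VMM uses to issue the corresponding iommufd requests; they are not real iommufd or QEMU APIs.

```python
def attach_device(device, viommus, try_create_vdevice, alloc_viommu):
    """Bind `device` to a vIOMMU object, allocating a new one on failure.

    viommus            -- list of vIOMMU objects the VMM already holds
    try_create_vdevice -- hypothetical callback; returns True if a vDevice
                          for `device` could be created on the given vIOMMU
    alloc_viommu       -- hypothetical callback; allocates a new vIOMMU
                          object suitable for `device`
    """
    # Test: try every vIOMMU object we already have.
    for viommu in viommus:
        if try_create_vdevice(viommu, device):
            return viommu

    # Allocate if that fails: none of the existing objects accepted the device.
    viommu = alloc_viommu(device)
    if not try_create_vdevice(viommu, device):
        raise RuntimeError("vDevice creation failed on a freshly allocated vIOMMU")
    viommus.append(viommu)
    return viommu
```

With this shape, devices behind the same pIOMMU end up sharing one object, and a device behind a different pIOMMU triggers a new allocation.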
The viommu object should be tied to the VMM's vIOMMU vHW object that it is paravirtualizing toward the VM.
So we shouldn't be creating viommu objects on demand; they should be created when the vIOMMU is created, and then presumably the qemu command line will describe how to link vPCI/VFIO functions to vIOMMU instances. If the kernel won't allow the user's configuration then it should fail, IMHO.
Intel's virtual IOMMU in QEMU has one instance but could create two vIOMMU objects for devices behind two different pIOMMUs. So, in this case, it does the on-demand (or try-and-fail) approach?
One corner case that Yi reminded me of is a VMM having two virtual IOMMUs for two devices that are behind the same pIOMMU. In that case these two virtual IOMMUs don't necessarily share the same vIOMMU object, i.e. the VMM is allowed to allocate two vIOMMU objs?
Some try-and-fail might be interesting to auto-provision vIOMMUs and provision vPCI functions. Though I suspect we will be providing information in other ioctls so something like libvirt can construct the correct configuration directly.
By "auto-provision", you mean libvirt assigning devices to the correct virtual IOMMUs corresponding to the physical instances? If so, we can just match the "iommu" sysfs node of devices with the iommu node(s) under /sys/class/iommu/, right?
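A rough sketch of that sysfs matching, assuming PCI devices and that each device's "iommu" link resolves to the same target as one of the entries under /sys/class/iommu/. The function name `group_devices_by_iommu` and the `sysfs_root` parameter are made up for illustration (the latter only so the logic can be exercised against a fake tree).

```python
import os
from collections import defaultdict

def group_devices_by_iommu(sysfs_root="/sys"):
    """Map each IOMMU instance name to the PCI devices sitting behind it."""
    class_dir = os.path.join(sysfs_root, "class", "iommu")
    pci_dir = os.path.join(sysfs_root, "bus", "pci", "devices")

    # Resolve every entry under class/iommu to its real path.
    known = {}
    if os.path.isdir(class_dir):
        for name in os.listdir(class_dir):
            known[os.path.realpath(os.path.join(class_dir, name))] = name

    groups = defaultdict(list)
    if os.path.isdir(pci_dir):
        for bdf in sorted(os.listdir(pci_dir)):
            link = os.path.join(pci_dir, bdf, "iommu")
            if not os.path.exists(link):
                continue  # device is not behind any IOMMU
            # Match the device's "iommu" node against the known instances.
            name = known.get(os.path.realpath(link))
            if name is not None:
                groups[name].append(bdf)
    return dict(groups)
```

Devices that land in the same group sit behind the same physical instance, so a tool could use the grouping to decide which virtual IOMMU each one gets assigned to.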
Thanks
Nicolin