From: Baolu Lu baolu.lu@linux.intel.com Sent: Wednesday, March 8, 2023 10:08 AM
On 3/7/23 8:46 PM, Jason Gunthorpe wrote:
On Tue, Mar 07, 2023 at 08:42:06AM +0000, Tian, Kevin wrote:
From: Jason Gunthorpejgg@nvidia.com Sent: Saturday, February 25, 2023 8:28 AM
[...]
The implementation is complicated because we have to introduce some per-iommu_group memory in iommufd and redo how we think about
multi-
device groups to be more explicit. This solves all the locking problems in the prior attempts.
Now think about the pasid case.
pasid attach is managed as a device operation today: iommu_attach_device_pasid()
Following it naturally we'll have a pasid array per iommufd_device to track attached HWPT per pasid.
But internally there is only one pasid table per iommu group. i.e. same story as RID attach that once dev1 replaces hwpt on pasid1 then it takes effect on all other devices in the same group.
IMHO I can't belive that any actual systems that support PASID have a RID aliasing problem too.
I think we should fix the iommu core to make PASID per-device and require systems that have a RID aliasing problem to block PASID.
This is a bigger picture, if drivers have to optionally share their PASID tables with other drivers then we can't have per-driver PASID allocators at all either.
This is actually required in PCI and IOMMU core. pci_enable_pasid() requires full ACS support on device's upstream path:
if (!pci_acs_path_enabled(pdev, NULL, PCI_ACS_RR | PCI_ACS_UF)) return -EINVAL;
and, for such PCI topology, iommu core always allocates an exclusive iommu group.
The only place where seems to be a little messy is,
static int __iommu_set_group_pasid(struct iommu_domain *domain, struct iommu_group *group, ioasid_t pasid) { struct group_device *device; int ret = 0;
list_for_each_entry(device, &group->devices, list) { ret = domain->ops->set_dev_pasid(domain, device->dev,
pasid); if (ret) break; }
return ret;
}
yes, this is exactly where my confusion comes. I thought we ever agreed that PASID usage won't be supported on multi-devices group. Then I saw about code (but didn't catch the implication of pci acs) which led me to think about the group mess.
sticking to per-device pasid would certainly simplify the logic in iommufd. 😊
Perhaps we need a check on singleton group?
or move the xarray to device? Storing it in group while adding a singleton restriction only causes confusion.