Re: [PATCH v2 08/10] iommufd/device: Use iommu_group_replace_domain()

15 Feb 2023


      On Wed, Feb 15, 2023 at 02:15:59AM +0000, Tian, Kevin wrote:
...
External email: Use caution opening links or attachments
...
From: Nicolin Chen nicolinc@nvidia.com
Sent: Wednesday, February 15, 2023 9:59 AM
On Wed, Feb 15, 2023 at 01:37:19AM +0000, Tian, Kevin wrote:
...
...
From: Nicolin Chen nicolinc@nvidia.com
Sent: Tuesday, February 14, 2023 6:54 PM
On Mon, Feb 13, 2023 at 10:49:34AM -0400, Jason Gunthorpe wrote:
...
On Sun, Feb 12, 2023 at 11:48:30PM -0800, Nicolin Chen wrote:
...
What about point 1? If dev2 and dev3 are already replaced when
doing iommu_group_replace_domain() on dev1, their idev objects
still have the old hwpt/iopt until user space does another two
IOCTLs on them, right?
We have a complicated model for multi-device groups...
The first device in the group to change domains must move all the
devices in the group
But userspace is still expected to run through and change all the
other devices
So replace should be a NOP if the group is already linked to the right
domain.
This patch needs to make sure that incosistency in the view betwen the
iommu_group and the iommufd model doesn't cause a functional
problem.
Yea, I was thinking that we'd need to block any access to the
idev->hwpt of a pending device's, before the kernel finishes
the "NOP" IOCTL from userspace, maybe with a helper:
      (iommu_get_domain_for_dev(idev->dev) != idev->hwpt->domain)
I didn't see what would be broken w/o such blocking measure.
Can you elaborate?
iommu_group { idev1, idev2 }
(1) Attach all devices to domain1 {
      group->domain = domain1;
      idev1->hwpt = hwpt1; // domain1
      idev2->hwpt = hwpt1; // domain1
}
(2) Attach (replace) idev1 only to domain2 {
      group->domain = domain2
      idev1->hwpt = hwpt2; // domain2
      idev2->hwpt == domain1 // != iommu_get_domain_for_dev()
}
Then if user space isn't aware of these and continues to do
IOMMU_IOAS_MAP for idev2. Though IOVA mappings may be added
onto the domain2 correctly, yet domain2's iopt itree won't
reflect that, until idev2->hwpt is updated too, right?
IOMMU_IOAS_MAP is for ioas instead of for device.
even w/o any device attached we still allow ioas map.
also note in case of domain1 still actively attached to idev3 in
another group, banning IOAS_MAP due to idev2 in transition
is also incorrect for idev3.
OK. That's likely true. It doesn't seem to be correct to block an
IOMMU_IOAS_MAP.
But things will be out of control, if user space continues mapping
something onto domain1's iopt for idev2, even after it is attached
covertly to domain2's iopt by the replace routine. I wonder how
kernel should handle this and keep the consistency between IOMMUFD
objects and iommu_group.
...
...
And the tricky thing is that, though we advocate a device-
centric uAPI, we'd still have to ask user space to align the
devices in the same iommu_group, which is also not present
in QEMU's RFC v3 series.
IMHO this requirement has been stated clearly from the start
when designing this device centric interface. QEMU should be
improved to take care of it.
OK. It has to be so...
Thanks
Nic

2026

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH v2 08/10] iommufd/device: Use iommu_group_replace_domain()