On Thu, Feb 09, 2023 at 04:00:52AM +0000, Tian, Kevin wrote:
From: Nicolin Chen nicolinc@nvidia.com Sent: Wednesday, February 8, 2023 5:18 AM
iommu_group_replace_domain() is introduced to support use cases where an iommu_group can be attached to a new domain without getting detached from the old one. This replacement feature will be useful, for cases such as:
- vPASID mode, when a guest wants to replace a single pasid (PASID=0) table with a larger table (PASID=N)
- Nesting mode, when switching the attaching device from an S2 domain to an S1 domain, or when switching between relevant S1 domains.
as it allows these cases to switch seamlessly without a DMA disruption.
So, call iommu_group_replace_domain() in the iommufd_device_do_attach(). And add a __iommmufd_device_detach helper to allow the replace routine to do a partial detach on the current hwpt that's being replaced. Though the updated locking logic is overcomplicated, it will be eased, once those iopt_table_add/remove_ioas and list_add/del calls are moved to hwpt's allocation/destroy() functions in the coming nesting series, as that'll depend on a new ->domain_alloc_user op in the iommu core.
then why not moving those changes into this series to make it simple?
The simplification depends on the new ->domain_alloc_user op and its implementation in SMMU driver, which would be introduced by the nesting series of VT-d and SMMU respectively.
At this point, it's hard to decide the best sequence of our three series. Putting this replace series first simply because it seems to be closer to get merged than the other two bigger series.
Also, block replace operations that are from/to auto_domains, i.e. only user-allocated hw_pagetables can be replaced or replaced with.
where does this restriction come from? iommu_group_replace_domain() can switch between any two UNMANAGED domains. What is the extra problem in iommufd to support from/to auto domains?
It was my misunderstanding. We should have supported that. Will fix in v3 and add the corresponding support.
+/**
- __iommmufd_device_detach - Detach a device from idev->hwpt to
new_hwpt
'from ... to ...' means a replace semantics. then this should be called iommufd_device_replace_hwpt().
Had a hard time to think about the naming, it's indeed a detach- only routine, but it takes a parameter named new_hwpt...
Perhaps I should just follow Yi's suggestion by rephrasing the narrative of this function.
+static void __iommmufd_device_detach(struct iommufd_device *idev,
struct iommufd_hw_pagetable
*new_hwpt,
bool detach_group)
+{
struct iommufd_hw_pagetable *hwpt = idev->hwpt;
struct iommufd_ioas *new_ioas = NULL;
if (new_hwpt)
new_ioas = new_hwpt->ioas;
mutex_lock(&hwpt->devices_lock);
list_del(&idev->devices_item);
if (hwpt->ioas != new_ioas)
mutex_lock(&hwpt->ioas->mutex);
I think new_ioas->mutext was meant here.
new_hwpt is an input from an replace routine, where it holds the new_ioas->mutex already. Yi's right that the code here is a bit confusing. I will try to change it a bit for readability.
if (!iommufd_hw_pagetable_has_group(hwpt, idev->group)) {
if (list_empty(&hwpt->devices)) {
iopt_table_remove_domain(&hwpt->ioas->iopt,
hwpt->domain);
list_del(&hwpt->hwpt_item);
}
I'm not sure how this can be fully shared between detach and replace. Here some work e.g. above needs to be done before calling iommu_group_replace_domain() while others can be done afterwards.
This iopt_table_remove_domain/list_del is supposed to be done in the hwpt's destroy() actually. We couldn't move it because it'd need the new domain_alloc_user op and its implementation in ARM driver. Overall, I think it should be safe to put it behind the iommu_group_replace_domain().
Thanks Nic