From: Nicolin Chen nicolinc@nvidia.com Sent: Wednesday, February 8, 2023 5:18 AM
iommu_group_replace_domain() is introduced to support use cases where an iommu_group can be attached to a new domain without getting detached from the old one. This replacement feature will be useful, for cases such as:
- vPASID mode, when a guest wants to replace a single pasid (PASID=0) table with a larger table (PASID=N)
- Nesting mode, when switching the attaching device from an S2 domain to an S1 domain, or when switching between relevant S1 domains.
as it allows these cases to switch seamlessly without a DMA disruption.
So, call iommu_group_replace_domain() in the iommufd_device_do_attach(). And add a __iommmufd_device_detach helper to allow the replace routine to do a partial detach on the current hwpt that's being replaced. Though the updated locking logic is overcomplicated, it will be eased, once those iopt_table_add/remove_ioas and list_add/del calls are moved to hwpt's allocation/destroy() functions in the coming nesting series, as that'll depend on a new ->domain_alloc_user op in the iommu core.
then why not moving those changes into this series to make it simple?
Also, block replace operations that are from/to auto_domains, i.e. only user-allocated hw_pagetables can be replaced or replaced with.
where does this restriction come from? iommu_group_replace_domain() can switch between any two UNMANAGED domains. What is the extra problem in iommufd to support from/to auto domains?
Signed-off-by: Nicolin Chen nicolinc@nvidia.com
drivers/iommu/iommufd/device.c | 101 +++++++++++++++++------- drivers/iommu/iommufd/iommufd_private.h | 2 + 2 files changed, 76 insertions(+), 27 deletions(-)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index b8c3e3baccb5..8a9834fc129a 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -9,6 +9,8 @@ #include "io_pagetable.h" #include "iommufd_private.h"
+MODULE_IMPORT_NS(IOMMUFD_INTERNAL);
static bool allow_unsafe_interrupts; module_param(allow_unsafe_interrupts, bool, S_IRUGO | S_IWUSR); MODULE_PARM_DESC( @@ -194,9 +196,61 @@ static bool iommufd_hw_pagetable_has_group(struct iommufd_hw_pagetable *hwpt, return false; }
+/**
- __iommmufd_device_detach - Detach a device from idev->hwpt to
new_hwpt
'from ... to ...' means a replace semantics. then this should be called iommufd_device_replace_hwpt().
- @idev: device to detach
- @new_hwpt: new hw_pagetable to attach (pass in NULL for a simple
detach)
- @detach_group: flag to call iommu_detach_group
- This is a cleanup helper shared by the replace and detach routines.
Comparing
- to a detach routine, a replace routine only needs a partial detach
procedure:
- it does not need the iommu_detach_group(); it will attach the device to a
new
- hw_pagetable after a partial detach from the currently attached
hw_pagetable,
- so certain steps can be skipped if two hw_pagetables have the same IOAS.
- */
+static void __iommmufd_device_detach(struct iommufd_device *idev,
struct iommufd_hw_pagetable
*new_hwpt,
bool detach_group)
+{
- struct iommufd_hw_pagetable *hwpt = idev->hwpt;
- struct iommufd_ioas *new_ioas = NULL;
- if (new_hwpt)
new_ioas = new_hwpt->ioas;
- mutex_lock(&hwpt->devices_lock);
- list_del(&idev->devices_item);
- if (hwpt->ioas != new_ioas)
mutex_lock(&hwpt->ioas->mutex);
I think new_ioas->mutext was meant here.
- if (!iommufd_hw_pagetable_has_group(hwpt, idev->group)) {
if (list_empty(&hwpt->devices)) {
iopt_table_remove_domain(&hwpt->ioas->iopt,
hwpt->domain);
list_del(&hwpt->hwpt_item);
}
I'm not sure how this can be fully shared between detach and replace. Here some work e.g. above needs to be done before calling iommu_group_replace_domain() while others can be done afterwards.