From: Jason Gunthorpe <jgg@nvidia.com> Sent: Wednesday, February 15, 2023 8:53 PM
On Wed, Feb 15, 2023 at 06:10:47AM +0000, Tian, Kevin wrote:
From: Nicolin Chen <nicolinc@nvidia.com> Sent: Wednesday, February 8, 2023 5:18 AM
+int iommu_group_replace_domain(struct iommu_group *group,
+				struct iommu_domain *new_domain)
+{
+	int ret;
+
+	if (!new_domain)
+		return -EINVAL;
+
+	mutex_lock(&group->mutex);
+	ret = __iommu_group_set_domain(group, new_domain);
+	if (ret)
+		__iommu_group_set_domain(group, group->domain);
I just realized the error unwind is a nop, given the check below:
__iommu_group_set_domain()
{
	if (group->domain == new_domain)
		return 0;
	...
There was an attempt [1] to fix the error unwind in iommu_attach_group() by temporarily setting group->domain to NULL before calling set_domain().
Jason, I wonder why this recovery cannot be done in __iommu_group_set_domain() directly, e.g.:
	ret = __iommu_group_for_each_dev(group, new_domain,
					 iommu_group_do_attach_device);
	if (ret) {
		__iommu_group_for_each_dev(group, group->domain,
					   iommu_group_do_attach_device);
		return ret;
	}
	group->domain = new_domain;
We talked about this already. Sometimes this is not the correct recovery, e.g. if we are going to a blocking domain we need to drop all references to the prior domain, not put them back.
Failures are WARN_ON events, not error recovery.
OK, I remember that. Then it looks like here we also need to temporarily set group->domain to NULL before calling set_domain() to recover, as [1] does.
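For readers following along, the nop unwind and the NULL workaround discussed above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: struct mock_group, struct mock_domain, the hw_domain and attach_calls fields, and all mock_* functions are hypothetical stand-ins rather than the kernel's definitions, and the model assumes a failed attach leaves the devices detached. Locking is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real ones. */
struct mock_domain {
	int fail_attach;		/* nonzero: attaching to this domain fails */
};

struct mock_group {
	struct mock_domain *domain;	/* what the core believes is attached */
	struct mock_domain *hw_domain;	/* what the devices are really attached to */
	int attach_calls;		/* number of real attach attempts */
};

/* Models the early return quoted in the thread. */
static int mock_set_domain(struct mock_group *group,
			   struct mock_domain *new_domain)
{
	if (group->domain == new_domain)
		return 0;

	group->attach_calls++;
	if (new_domain->fail_attach) {
		/* Assumption: a failed attach leaves the devices detached. */
		group->hw_domain = NULL;
		return -1;
	}
	group->hw_domain = new_domain;
	group->domain = new_domain;
	return 0;
}

/* Unwind as in the quoted patch: the second call hits the early return. */
static int mock_replace_nop_unwind(struct mock_group *group,
				   struct mock_domain *new_domain)
{
	int ret = mock_set_domain(group, new_domain);

	if (ret)
		mock_set_domain(group, group->domain);
	return ret;
}

/* Unwind with group->domain cleared first, in the spirit of [1]. */
static int mock_replace_null_unwind(struct mock_group *group,
				    struct mock_domain *new_domain)
{
	struct mock_domain *old = group->domain;
	int ret = mock_set_domain(group, new_domain);

	if (ret) {
		group->domain = NULL;	/* defeat the early return */
		mock_set_domain(group, old);
	}
	return ret;
}
```

In this model the first variant makes only one real attach attempt and leaves hw_domain at NULL after a failure, while the second makes a second attempt and restores the old domain. It does not address the blocking-domain case raised above, where putting the old domain back is not the right recovery.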
[1] https://lore.kernel.org/linux-iommu/20230215052642.6016-1-vasant.hegde@amd.c...