From: Jason Gunthorpe jgg@nvidia.com Sent: Saturday, February 25, 2023 8:28 AM
- /*
* All objects using a group reference must put it before their destroy
* completes
*/
- new_igroup->ictx = ictx;
Looks the comment is not related to the code. Probably put it in the destroy function?
- /*
* We dropped the lock so igroup is invalid. Assume that the
* xa had NULL in it, if this guess is wrong then we will obtain
* the actual value under lock and try again once.
this reads like "if this guess is wrong" is checked outside of the lock. I'd just remove "under lock".
*/
- cur_igroup = NULL;
- xa_lock(&ictx->groups);
- while (true) {
igroup = __xa_cmpxchg(&ictx->groups, id, cur_igroup,
new_igroup,
GFP_KERNEL);
if (xa_is_err(igroup)) {
xa_unlock(&ictx->groups);
iommufd_put_group(new_igroup);
return ERR_PTR(xa_err(igroup));
}
/* new_group was successfully installed */
if (cur_igroup == igroup) {
xa_unlock(&ictx->groups);
return new_igroup;
}
/* Check again if the current group is any good */
if (igroup && igroup->group == group &&
kref_get_unless_zero(&igroup->ref)) {
xa_unlock(&ictx->groups);
iommufd_put_group(new_igroup);
return igroup;
}
cur_igroup = igroup;
- }
Add a WARN_ON(igroup->group != group). The only valid race should be when an existing group which is created by another device in the same iommu group is being destroyed then we want another try hoping that object will be removed from xarray soon. But there should not be a case with the same slot pointing to a different iommu group.
@@ -98,7 +195,7 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx, /* The calling driver is a user until iommufd_device_unbind() */ refcount_inc(&idev->obj.users); /* group refcount moves into iommufd_device */
- idev->group = group;
- idev->igroup = igroup;
the comment about group refcount is stale now.