On Mon, 14 Oct 2024 at 23:46, Nicolin Chen nicolinc@nvidia.com wrote:
On Mon, Oct 14, 2024 at 03:58:55PM +0800, Zhangfei Gao wrote:
+struct iommufd_object *iommufd_object_alloc_elm(struct iommufd_ctx *ictx,
size_t size,
enum iommufd_object_type type)
+{
struct iommufd_object *obj;
int rc;
obj = kzalloc(size, GFP_KERNEL_ACCOUNT);
if (!obj)
return ERR_PTR(-ENOMEM);
obj->type = type;
/* Starts out bias'd by 1 until it is removed from the xarray */
refcount_set(&obj->shortterm_users, 1);
refcount_set(&obj->users, 1);
here set refcont 1
iommufd_device_bind -> iommufd_object_alloc(ictx, idev, IOMMUFD_OBJ_DEVICE): refcont -> 1 refcount_inc(&idev->obj.users); refcount -> 2 will cause iommufd_device_unbind fail.
May remove refcount_inc(&idev->obj.users) in iommufd_device_bind
Hmm, why would it fail? Or is it failing on your system?
Not sure, still in check, it may only be on my platform.
it hit iommufd_object_remove: if (WARN_ON(obj != to_destroy))
iommufd_device_bind refcount=2 iommufd_device_attach refcount=3 //still not sure which operation inc the count? iommufd_device_detach refcount=4
Have a question, when should iommufd_vdevice_destroy be called, before or after iommufd_device_unbind.
Before.
Now iommufd_vdevice_destroy (ref--) is after unbind, hits the if (!refcount_dec_if_one(&obj->users)) check.
Hmm, where do we have an iommufd_vdevice_destroy after unbind?
Looks like it is called by close fd? [ 2480.909319] iommufd_vdevice_destroy+0xdc/0x168 [ 2480.909324] iommufd_fops_release+0xa4/0x140 [ 2480.909328] __fput+0xd0/0x2e0 [ 2480.909331] ____fput+0x1c/0x30 [ 2480.909333] task_work_run+0x78/0xe0 [ 2480.909337] do_exit+0x2fc/0xa50 [ 2480.909340] do_group_exit+0x3c/0xa0 [ 2480.909344] get_signal+0x96c/0x978 [ 2480.909346] do_signal+0x94/0x3a8 [ 2480.909348] do_notify_resume+0x100/0x190
iommufd_device_bind iommufd_device_attach iommufd_vdevice_alloc_ioctl
iommufd_device_detach iommufd_device_unbind // refcount check fail iommufd_vdevice_destroy ref--
Things should be symmetric. As you suspected, vdevice should be destroyed before iommufd_device_detach.
I am trying based on your for_iommufd_viommu_p2-v3 branch, do you have this issue? In checking whether close fd before unbind?
A vdev is an object on top of a vIOMMU obj and an idev obj, so it takes a refcount from each of them. That's why idev couldn't unbind.
Thanks
Thanks Nicolin