On Wed, 16 Oct 2024 at 14:52, Nicolin Chen <nicolinc@nvidia.com> wrote:
On Wed, Oct 16, 2024 at 09:56:51AM +0800, Zhangfei Gao wrote:
On Wed, 16 Oct 2024 at 02:44, Nicolin Chen <nicolinc@nvidia.com> wrote:
On Mon, Oct 14, 2024 at 07:01:40PM -0700, Nicolin Chen wrote:
On Tue, Oct 15, 2024 at 09:15:01AM +0800, Zhangfei Gao wrote:
> iommufd_device_bind
> iommufd_device_attach
> iommufd_vdevice_alloc_ioctl
>
> iommufd_device_detach
> iommufd_device_unbind // refcount check fail
> iommufd_vdevice_destroy ref--
Things should be symmetric. As you suspected, vdevice should be destroyed before iommufd_device_detach.
I am testing on top of your for_iommufd_viommu_p2-v3 branch. Do you see this issue as well? I am checking whether the fd is closed before unbind.
Oops, my bad. I will provide a fix.
This should fix the problem:
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 5fd3dd420290..13100cfea29d 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -277,6 +277,11 @@ EXPORT_SYMBOL_NS_GPL(iommufd_ctx_has_group, IOMMUFD);
  */
 void iommufd_device_unbind(struct iommufd_device *idev)
 {
+	mutex_lock(&idev->igroup->lock);
+	/* idev->vdev object should be destroyed prior, yet just in case.. */
+	if (idev->vdev)
+		iommufd_object_remove(idev->ictx, NULL, idev->vdev->obj.id, 0);
+	mutex_unlock(&idev->igroup->lock);
 	iommufd_object_destroy_user(idev->ictx, &idev->obj);
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_device_unbind, IOMMUFD);
Not yet

[  574.162112] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000004
[  574.261102] pc : iommufd_object_remove+0x7c/0x278
[  574.265795] lr : iommufd_device_unbind+0x44/0x98

in check
Hmm, it's kinda odd it crashes inside iommufd_object_remove(). Did you happen to change something there?
The added iommufd_object_remove() is equivalent to userspace calling the destroy ioctl on the vDEVICE object.
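To illustrate the ordering problem and what the fallback in unbind changes, here is a minimal userspace toy model. It is only a sketch: every toy_* name is hypothetical and none of this is the kernel's iommufd code. It only assumes, as the call trace above suggests, that an allocated vDEVICE holds a reference on its device and that unbind refuses to proceed while that reference is outstanding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, NOT the kernel's iommufd structures. */
struct toy_vdev {
	bool alive;
};

struct toy_idev {
	int users;             /* references held by dependent objects */
	struct toy_vdev *vdev; /* non-NULL while a vDEVICE is allocated */
	bool bound;
};

void toy_bind(struct toy_idev *idev)
{
	idev->users = 0;
	idev->vdev = NULL;
	idev->bound = true;
}

/* Allocating a vDEVICE takes a reference on its device. */
void toy_vdev_alloc(struct toy_idev *idev, struct toy_vdev *vdev)
{
	vdev->alive = true;
	idev->vdev = vdev;
	idev->users++;
}

/* Destroying it drops that reference (the "ref--" step in the trace). */
void toy_vdev_destroy(struct toy_idev *idev)
{
	idev->vdev->alive = false;
	idev->vdev = NULL;
	idev->users--;
}

/* Unpatched behaviour: unbind fails while a reference is still held. */
int toy_unbind_strict(struct toy_idev *idev)
{
	if (idev->users != 0)
		return -1;
	idev->bound = false;
	return 0;
}

/* Patched behaviour: remove a leftover vDEVICE first, then unbind. */
int toy_unbind_fixed(struct toy_idev *idev)
{
	if (idev->vdev)
		toy_vdev_destroy(idev);
	return toy_unbind_strict(idev);
}
```

In this model, calling toy_bind(), then toy_vdev_alloc(), then toy_unbind_strict() returns -1 because the vDEVICE reference is still held, while toy_unbind_fixed() in the same state drops the leftover vDEVICE and returns 0. The real patch additionally takes idev->igroup->lock around the check, which the toy leaves out.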
Yes, I double-checked and it does solve the issue. The guest can stop and run again.
The NULL pointer dereference may have been caused by the debug code I added.
Thanks Nico.
Nicolin