On Wed, May 28, 2025 at 02:17:54PM -0300, Jason Gunthorpe wrote:
> On Sat, May 17, 2025 at 08:21:27PM -0700, Nicolin Chen wrote:
> > A new HW QUEUE object will be added for HW to access the guest queue for the HW-accelerated virtualization feature. Some HW QUEUEs are designed to access the guest queue via a host physical address, without doing a translation through the nesting parent IO page table, while others can use the guest physical address. For the former case, the kernel, working with a VMM, needs to pin the physical pages backing the guest memory, to lock them while the HW QUEUE is accessing them, and to ensure those physical pages are contiguous in the physical address space.
> > This is very much like the existing iommufd_access_pin_pages() that outputs the pinned page list for the caller to test its contiguity.
> > Move that code out of iommufd_access_pin/unpin_pages() and the related function into a pair of iopt helpers that can be shared with the HW QUEUE allocator.
> >
> > Rename check_area_prot() to align with the existing iopt_area helpers, and inline it in the header since iommufd_access_rw() still uses it.
> > Reviewed-by: Pranjal Shrivastava <praan@google.com>
> > Reviewed-by: Kevin Tian <kevin.tian@intel.com>
> > Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
> > Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
> >  drivers/iommu/iommufd/io_pagetable.h    |   8 ++
> >  drivers/iommu/iommufd/iommufd_private.h |   6 ++
> >  drivers/iommu/iommufd/device.c          | 119 ++---------------------
> >  drivers/iommu/iommufd/io_pagetable.c    |  97 +++++++++++++++++++
> >  4 files changed, 119 insertions(+), 111 deletions(-)
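As a side note, the contiguity test that the caller runs on the pinned page list can be pictured with a small userspace sketch over a PFN array. This is purely illustrative (the kernel code operates on struct page pointers, and pfns_are_contiguous() is a made-up name), but the check is the same idea:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Illustrative only: a caller holding a pinned page-frame-number list
 * can verify physical contiguity by checking that every PFN is exactly
 * one above the previous PFN.
 */
static bool pfns_are_contiguous(const unsigned long *pfns, size_t npages)
{
	size_t i;

	if (npages == 0)
		return false;
	for (i = 1; i < npages; i++)
		if (pfns[i] != pfns[i - 1] + 1)
			return false;
	return true;
}
```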
> And if you do what was suggested, do we need this patch at all? Just use the normal access sequence:
>
>	iommufd_access_create(ops=NULL)
>	iommufd_access_attach(viommu->hwpt->ioas)
>	iommufd_access_pin_pages()
>
> And store a viommu->access pointer to undo it all.
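For the record, the undo path for such a viommu->access would be the reverse of create/attach/pin. The stubs below only record call order (the real iommufd_access_unpin_pages/detach/destroy() live in drivers/iommu/iommufd/device.c, and viommu_undo_access() is a hypothetical name), just to make the ordering concrete:

```c
#include <string.h>

/* Call log: each stub appends its name so the teardown order is visible. */
static char call_log[64];

/* Stand-ins for the real iommufd calls; they only record that they ran. */
static void stub_unpin_pages(void) { strcat(call_log, "unpin,"); }
static void stub_detach(void)      { strcat(call_log, "detach,"); }
static void stub_destroy(void)     { strcat(call_log, "destroy"); }

/* Hypothetical undo helper: reverse of create -> attach -> pin. */
static void viommu_undo_access(void)
{
	stub_unpin_pages(); /* iommufd_access_unpin_pages(access, iova, len) */
	stub_detach();      /* iommufd_access_detach(access) */
	stub_destroy();     /* iommufd_access_destroy(access) */
}
```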
I found the entire ictx would be held open by iommufd_access_create(): the release fop then couldn't even get invoked to destroy objects.
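To picture the problem: the internal access holds a reference on the ictx, but the ictx release path is the very thing that would destroy that access, so the users count never reaches zero. A toy model with a plain int standing in for the kernel's refcount_t (none of this is iommufd code):

```c
#include <stdbool.h>

/* Toy context: a plain int stands in for the kernel's refcount_t. */
struct toy_ctx { int users; };

static void toy_ctx_get(struct toy_ctx *ctx) { ctx->users++; }
static void toy_ctx_put(struct toy_ctx *ctx) { ctx->users--; }

/* Release can only run once the last user reference is dropped. */
static bool toy_release_can_run(const struct toy_ctx *ctx)
{
	return ctx->users == 0;
}
```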
I added a new flag to address this:
-----------------------------------------------------------------
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index f25e272ae378c..a3e0ace583a66 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -1085,7 +1085,8 @@ void iommufd_access_destroy_object(struct iommufd_object *obj)
 	if (access->ioas)
 		WARN_ON(iommufd_access_change_ioas(access, NULL));
 	mutex_unlock(&access->ioas_lock);
-	iommufd_ctx_put(access->ictx);
+	if (!access->ops->internal_use)
+		iommufd_ctx_put(access->ictx);
 }

 /**
@@ -1126,7 +1127,8 @@ iommufd_access_create(struct iommufd_ctx *ictx,
 	/* The calling driver is a user until iommufd_access_destroy() */
 	refcount_inc(&access->obj.users);
 	access->ictx = ictx;
-	iommufd_ctx_get(ictx);
+	if (!ops->internal_use)
+		iommufd_ctx_get(ictx);
 	iommufd_object_finalize(ictx, &access->obj);
 	*id = access->obj.id;
 	mutex_init(&access->ioas_lock);
-----------------------------------------------------------------
Btw, I think we can have an ops while leaving unmap set to NULL:

static const struct iommufd_access_ops hw_queue_access_ops = {
	.needs_pin_pages = 1,
+	.internal_use = 1,
	/* NULL unmap to reject IOMMUFD_CMD_IOAS_UNMAP */
};
Having two flags makes the code slightly more readable. After all, the HW QUEUE does need to pin pages.
Thanks
Nicolin