On Thu, May 15, 2025 at 06:30:27AM +0000, Tian, Kevin wrote:
From: Nicolin Chen <nicolinc@nvidia.com>
Sent: Friday, May 9, 2025 11:03 AM
+/**
+ * struct iommu_hw_queue_alloc - ioctl(IOMMU_HW_QUEUE_ALLOC)
+ * @size: sizeof(struct iommu_hw_queue_alloc)
+ * @flags: Must be 0
+ * @viommu_id: Virtual IOMMU ID to associate the HW queue with
+ * @type: One of enum iommu_hw_queue_type
+ * @index: The logical index to the HW queue per virtual IOMMU for a
+ *         multi-queue model
I'm thinking of an alternative that works without having the user assign the index and without allowing the driver to poke at object dependencies (the next patch).
Let's say the index is internally assigned by the driver, so this cmd just allocates a hw queue and it's up to the driver to decide the allocation policy, e.g. in ascending order.
Introduce a new flag in viommu_ops to indicate to the core that a new hw queue should hold a reference to the previous hw queue.
The core maintains a last_queue field in the viommu. Upon successful return from @hw_queue_alloc() the core increments the users refcount of last_queue, records the dependency in the iommufd_hw_queue struct, and updates viommu->last_queue.
Then the destroy order is naturally guaranteed.
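Roughly, the core-side bookkeeping could look like this (just a sketch; the flag and the last_queue/prev_queue fields are names I'm making up here):

	/* in the ioctl, after viommu->ops->hw_queue_alloc() succeeds */
	if (viommu->ops->flags & IOMMUFD_VIOMMU_FLAG_HW_QUEUE_DEPENDS) {
		if (viommu->last_queue) {
			/* the new queue pins its predecessor */
			refcount_inc(&viommu->last_queue->obj.users);
			hw_queue->prev_queue = viommu->last_queue;
		}
		viommu->last_queue = hw_queue;
	}

with the destroy path dropping the prev_queue reference, so queues can only be freed in reverse allocation order.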
I have thought about that too. It's nice that the core can easily maintain the dependency for the driver.
But there would still need to be an out_index to mark each dynamically allocated queue, so the VMM would know where it should map the queue.
For example, if the VMM wants to allocate a queue at its own index=1 without allocating index=0 first, the kernel cannot fail that, as the VMM doesn't provide the index. The only option left for the kernel would be to output the allocated queue with index=0 and then hope the VMM validates it, which doesn't sound safe..
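i.e. the uAPI would have to grow something like (a hypothetical field, not in this patch):

	__u32 out_index;	/* kernel-assigned slot, in ascending order */

and a VMM asking for its own index=1 would silently get slot 0 back, with nothing forcing it to cross-check.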
+ * @out_hw_queue_id: The ID of the new HW queue
+ * @base_addr: Base address of the queue memory in guest physical
+ *             address space
+ * @length: Length of the queue memory in the guest physical address
+ *          space
length is agnostic to address space.
Ack.
* @length: Length of the queue memory
+int iommufd_hw_queue_alloc_ioctl(struct iommufd_ucmd *ucmd)
+{
+	struct iommu_hw_queue_alloc *cmd = ucmd->cmd;
+	struct iommufd_hw_queue *hw_queue;
+	struct iommufd_hwpt_paging *hwpt;
+	struct iommufd_viommu *viommu;
+	struct page **pages;
+	int max_npages, i;
+	u64 end;
+	int rc;
+
+	if (cmd->flags || cmd->type == IOMMU_HW_QUEUE_TYPE_DEFAULT)
+		return -EOPNOTSUPP;
+	if (!cmd->base_addr || !cmd->length)
+		return -EINVAL;
+	if (check_add_overflow(cmd->base_addr, cmd->length - 1, &end))
+		return -EOVERFLOW;
+
+	max_npages = DIV_ROUND_UP(cmd->length, PAGE_SIZE);
what about [base_addr, base_addr+length) spanning two pages while 'length' is smaller than the size of one page?
Ah, right! Probably should be something like:
	offset = offset_in_page(cmd->base_addr);
	max_npages = DIV_ROUND_UP(offset + cmd->length, PAGE_SIZE);
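With 4K pages that covers e.g. base_addr=0xFFF8, length=0x10: the range spans two pages, but DIV_ROUND_UP(0x10, PAGE_SIZE) = 1, while DIV_ROUND_UP(0xFF8 + 0x10, 0x1000) = 2.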
+	pages = kcalloc(max_npages, sizeof(*pages), GFP_KERNEL);
+	if (!pages)
+		return -ENOMEM;
this could be moved to right before iopt_pin_pages().
Ack.
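i.e. roughly (a sketch, keeping the rest of the flow as posted), after the viommu/ops checks below:

	pages = kcalloc(max_npages, sizeof(*pages), GFP_KERNEL);
	if (!pages) {
		rc = -ENOMEM;
		goto out_put_viommu;
	}
	/* immediately followed by the iopt_pin_pages() call */

so the failure path becomes a goto rather than a direct return.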
+	viommu = iommufd_get_viommu(ucmd, cmd->viommu_id);
+	if (IS_ERR(viommu)) {
+		rc = PTR_ERR(viommu);
+		goto out_free;
+	}
+	hwpt = viommu->hwpt;
+
+	if (!viommu->ops || !viommu->ops->hw_queue_alloc) {
+		rc = -EOPNOTSUPP;
+		goto out_put_viommu;
+	}
+
+	/* Quick test on the base address */
+	if (!iommu_iova_to_phys(hwpt->common.domain, cmd->base_addr)) {
+		rc = -ENXIO;
+		goto out_put_viommu;
+	}
this check is redundant. Actually it's not future-proof, as it assumes that the S2 must be pinned before the user attempts to call this cmd. But what if one day iommufd supports an unpinned S2 (if a device is 100% PRI-faultable)? Then this path will be broken.
OK. Let's drop it.
+	hw_queue = viommu->ops->hw_queue_alloc(viommu, cmd->type, cmd->index,
+					       cmd->base_addr, cmd->length);
+	if (IS_ERR(hw_queue)) {
+		rc = PTR_ERR(hw_queue);
+		goto out_unpin;
+	}
+	hw_queue->viommu = viommu;
+	refcount_inc(&viommu->obj.users);
+	hw_queue->ictx = ucmd->ictx;
viommu/ictx are already set by iommufd_hw_queue_alloc().
OK. We'd need to be careful if someday there is a core-allocated hw_queue that doesn't call iommufd_hw_queue_alloc(). Maybe I can put a note here.
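Something like (wording is just a sketch):

	/*
	 * hw_queue->viommu and hw_queue->ictx are set by
	 * iommufd_hw_queue_alloc(); a core-allocated hw_queue that
	 * bypasses it would have to set them itself.
	 */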
Thanks
Nicolin