On Thu, May 15, 2025 at 01:06:20PM -0300, Jason Gunthorpe wrote:
On Thu, May 08, 2025 at 08:02:32PM -0700, Nicolin Chen wrote:
+/**
- struct iommu_hw_queue_alloc - ioctl(IOMMU_HW_QUEUE_ALLOC)
- @size: sizeof(struct iommu_hw_queue_alloc)
- @flags: Must be 0
- @viommu_id: Virtual IOMMU ID to associate the HW queue with
- @type: One of enum iommu_hw_queue_type
- @index: The logical index to the HW queue per virtual IOMMU for a multi-queue
model
- @out_hw_queue_id: The ID of the new HW queue
- @base_addr: Base address of the queue memory in guest physical address space
- @length: Length of the queue memory in the guest physical address space
- Allocate a HW queue object for a vIOMMU-specific HW-accelerated queue, which
- allows HW to access a guest queue memory described by @base_addr and @length.
- Upon success, the underlying physical pages of the guest queue memory will be
- pinned to prevent VMM from unmapping them in the IOAS until the HW queue gets
- destroyed.
Do we have way to make the pinning optional?
As I understand AMD's system the iommu HW itself translates the base_addr through the S2 page table automatically, so it doesn't need pinned memory and physical addresses but just the IOVA.
Right. That's why we invented a flag, and it should be probably extended to cover the pin step as well.
Perhaps for this reason the pinning should be done with a function call from the driver?
But the whole point of doing in the core was to avoid the entire iopt related structure/function from being exposed to the driver, which would otherwise hugely increase the size of the driver.o?
+struct iommu_hw_queue_alloc {
- __u32 size;
- __u32 flags;
- __u32 viommu_id;
- __u32 type;
- __u32 index;
- __u32 out_hw_queue_id;
- __aligned_u64 base_addr;
base addr should probably be called nesting_parent_iova to match how we described the viommu hwpt:
- @hwpt_id: ID of a nesting parent HWPT to associate to
Ack.
- /*
* The underlying physical pages must be pinned to prevent them from
* being unmapped (via IOMMUFD_CMD_IOAS_UNMAP) during the life cycle
* of the HW QUEUE object.
*/
- rc = iopt_pin_pages(&hwpt->ioas->iopt, cmd->base_addr, cmd->length,
pages, 0);
I don't think this actually works like this without an unmap callback. unmap will break:
iommufd_access_notify_unmap(iopt, area_first, length); /* Something is not responding to unmap requests. */ tries++; if (WARN_ON(tries > 100)) return -EDEADLOCK;
If it can't shoot down the pinning.
Hmm, I thought we want the unmap to fail until VMM releases the HW QUEUE first? In what case, does VMM wants to unmap while holding the queue pages?
Why did this need to change away from just a normal access? That ops and unmap callback are not optional things.
What vcmdq would do in the unmap callback is question I'm not sure of..
This is more reason to put the pin/access in the driver so it can provide an unmap callback that can fix it up.
As there are two types of "access" here.. you mean iommufd_access, i.e. vcmdq driver should hold an iommufd_access like an emulated vfio device driver?
I think this should have been done just by using the normal access mechanism, maybe with a simplifying wrapper for in-driver use. ie no need for patch #9
Still, the driver.o file will be very large, unlike VFIO that just depends on CONFIG_IOMMUFD?
Thanks Nicolin